林泽毅-theYeee 林泽毅 docs: del e1a4f0b 4 months ago

HivisionIDPhoto

English / 中文 / 日本語 / 한국어



Related Projects

  • SwanLab: Used throughout the training of the portrait matting model for analysis and monitoring, as well as collaboration with lab colleagues, significantly improving training efficiency.

🤩 Recent Updates

  • Online Experience: SwanHub DemoSpaces

  • 2024.09.24: API interface adds base64 image input option | Gradio Demo adds Layout Photo Cropping Lines feature

  • 2024.09.22: Gradio Demo adds Beast Mode and DPI parameter

  • 2024.09.18: Gradio Demo adds Share Template Photos feature and American Style background option

  • 2024.09.17: Gradio Demo adds Custom Background Color-HEX Input feature | (Community Contribution) C++ Version - HivisionIDPhotos-cpp contributed by zjkhahah

  • 2024.09.16: Gradio Demo adds Face Rotation Alignment feature, custom size input supports millimeters

  • 2024.09.14: Gradio Demo adds Custom DPI feature, adds Japanese and Korean support, adds Adjust Brightness, Contrast, Sharpness feature

  • 2024.09.12: Gradio Demo adds Whitening feature | API interface adds Watermark, Set Photo KB Size, ID Photo Cropping

  • 2024.09.11: Added transparent image display and download feature to Gradio Demo.


Project Overview

🚀 Thank you for your interest in our work. You may also want to check out our other work in the field of image processing. Feel free to reach out: zeyi.lin@swanhub.co

HivisionIDPhoto aims to develop a practical and systematic intelligent algorithm for producing ID photos.

It utilizes a comprehensive AI model workflow to recognize various user photo-taking scenarios, perform matting, and generate ID photos.

HivisionIDPhoto can achieve:

  1. Lightweight matting (purely offline, fast inference with CPU only)
  2. Generate standard ID photos and six-inch layout photos based on different size specifications
  3. Support pure offline or edge-cloud inference
  4. Beauty effects (planned)
  5. Intelligent formal wear change (planned)

If HivisionIDPhoto helps you, please star this repo or recommend it to friends who need ID photos in a hurry!


🏠 Community

We have shared some interesting applications and extensions of HivisionIDPhotos built by the community:

ComfyUI workflow

HivisionIDPhotos-wechat-weapp

HivisionIDPhotos-uniapp


🔧 Preparation

Environment installation and dependencies:

  • Python >= 3.7 (project primarily tested on Python 3.10)
  • OS: Linux, Windows, macOS

1. Clone the Project

git clone https://github.com/Zeyi-Lin/HivisionIDPhotos.git
cd HivisionIDPhotos

2. Install Dependencies

We recommend creating a Python 3.10 virtual environment with conda, then running the following commands:

pip install -r requirements.txt
pip install -r requirements-app.txt

3. Download Weight Files

Method 1: Script Download

python scripts/download_model.py --models all

Method 2: Direct Download

Download the weight files you need and store them in the project's hivision/creator/weights directory:

  • modnet_photographic_portrait_matting.onnx (24.7MB): Official weights of MODNet, download
  • hivision_modnet.onnx (24.7MB): Matting model with better adaptability for pure color background replacement, download
  • rmbg-1.4.onnx (176.2MB): Open-source matting model from BRIA AI, download and rename to rmbg-1.4.onnx
  • birefnet-v1-lite.onnx (224MB): Open-source matting model from ZhengPeng7, download and rename to birefnet-v1-lite.onnx

4. Face Detection Model Configuration (Optional)

| Extended Face Detection Model | Description | Documentation |
| --- | --- | --- |
| MTCNN | Offline face detection model, high-performance CPU inference, default model, lower detection accuracy | Use it directly after cloning this project |
| RetinaFace | Offline face detection model, moderate CPU inference speed (in seconds), and high accuracy | Download and place it in the hivision/creator/retinaface/weights directory |
| Face++ | Online face detection API launched by Megvii, higher detection accuracy, official documentation | Usage Documentation |

5. Performance Reference

Test environment: Mac M1 Max with 64GB of memory, no GPU acceleration. Test image resolutions: 512×715 (1) and 764×1146 (2).

| Model Combination | Memory Usage | Inference Time (1) | Inference Time (2) |
| --- | --- | --- | --- |
| MODNet + mtcnn | 410MB | 0.207s | 0.246s |
| MODNet + retinaface | 405MB | 0.571s | 0.971s |
| birefnet-v1-lite + retinaface | 6.20GB | 7.063s | 7.128s |

6. GPU Inference Acceleration (Optional)

In the current version, birefnet-v1-lite is the model that can be accelerated by NVIDIA GPUs. Make sure you have around 16GB of VRAM.

To use NVIDIA GPU acceleration for inference, first make sure CUDA and cuDNN are installed. Then install the onnxruntime-gpu version that matches your setup, following the onnxruntime-gpu documentation, and the matching pytorch version, following the pytorch official website.

# If your machine has CUDA 12.x and cuDNN 8 installed
# Installing torch is optional. If you cannot get cuDNN configured, try installing torch
pip install onnxruntime-gpu==1.18.0
pip install torch --index-url https://download.pytorch.org/whl/cu121

After completing the installation, call the birefnet-v1-lite model to utilize GPU acceleration for inference.
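Before running the model, it can help to confirm that onnxruntime actually sees the GPU. The following is a minimal sketch, not part of the project: `can_use_cuda` is a hypothetical helper, while `onnxruntime.get_available_providers()` and the provider name `CUDAExecutionProvider` come from onnxruntime itself.

```python
from typing import Iterable


def can_use_cuda(providers: Iterable[str]) -> bool:
    """Return True if the CUDA execution provider is among the given names."""
    # Hypothetical helper for illustration; not part of HivisionIDPhotos.
    return "CUDAExecutionProvider" in set(providers)


if __name__ == "__main__":
    try:
        # Only importable after installing onnxruntime or onnxruntime-gpu.
        import onnxruntime as ort
    except ImportError:
        print("onnxruntime is not installed")
    else:
        available = ort.get_available_providers()
        print("Available providers:", available)
        print("CUDA usable:", can_use_cuda(available))
```

If the CUDA provider is missing from the list, inference will most likely fall back to the CPU, which usually points to a CUDA, cuDNN, or onnxruntime-gpu version mismatch.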

TIP: CUDA is backward compatible. For example, if your installed CUDA version is 12.6 but the newest version torch currently supports is 12.4, you can still install the 12.4 build on your computer.

🚀 Run Gradio Demo

python app.py

Running the program starts a local web page where you can create and edit ID photos interactively.


🚀 Python Inference

Core parameters:

  • -i: Input image path
  • -o: Output image path
  • -t: Inference type, options are idphoto, human_matting, add_background, generate_layout_photos, idphoto_crop
  • --matting_model: Portrait matting model weight selection
  • --face_detect_model: Face detection model selection

More parameters can be viewed by running python inference.py --help
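When calling inference.py from another script, the command line can be assembled programmatically. This is a minimal sketch that uses only the flags listed above plus --height and --width from the examples in this README. The function name `build_inference_command` is hypothetical, and the sketch only builds the argument list without running anything.

```python
import shlex
from typing import List, Optional


def build_inference_command(
    input_path: str,
    output_path: str,
    task: Optional[str] = None,
    matting_model: Optional[str] = None,
    face_detect_model: Optional[str] = None,
    height: Optional[int] = None,
    width: Optional[int] = None,
) -> List[str]:
    """Assemble an argv list for inference.py from the documented flags."""
    cmd = ["python", "inference.py", "-i", input_path, "-o", output_path]
    if task is not None:
        cmd += ["-t", task]
    if matting_model is not None:
        cmd += ["--matting_model", matting_model]
    if face_detect_model is not None:
        cmd += ["--face_detect_model", face_detect_model]
    if height is not None:
        cmd += ["--height", str(height)]
    if width is not None:
        cmd += ["--width", str(width)]
    return cmd


if __name__ == "__main__":
    cmd = build_inference_command(
        "demo/images/test0.jpg", "./idphoto.png", height=413, width=295
    )
    # Print a shell-safe version of the command for inspection.
    print(" ".join(shlex.quote(part) for part in cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the project root. Passing a list rather than a single string avoids shell quoting problems with paths that contain spaces.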

1. ID Photo Creation

Takes one photo and produces one standard ID photo and one high-definition ID photo, both as 4-channel transparent PNGs.

python inference.py -i demo/images/test0.jpg -o ./idphoto.png --height 413 --width 295
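The pixel values passed to --height and --width can be derived from a physical print size. The sketch below is plain arithmetic and assumes 300 DPI, which this README does not state. Under that assumption, 413 × 295 px corresponds to a 35 mm × 25 mm photo. `mm_to_px` is a hypothetical helper, not a project function.

```python
MM_PER_INCH = 25.4


def mm_to_px(mm: float, dpi: int = 300) -> int:
    """Convert a length in millimeters to pixels at the given DPI (assumed 300)."""
    return round(mm / MM_PER_INCH * dpi)


if __name__ == "__main__":
    # A 35 mm x 25 mm photo at the assumed 300 DPI.
    print("height:", mm_to_px(35), "width:", mm_to_px(25))
```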

2. Portrait Matting

Takes one photo and produces one 4-channel transparent PNG.

python inference.py -t human_matting -i demo/images/test0.jpg -o ./idphoto_matting.png --matting_model hivision_modnet

3. Add Background Color to Transparent Image

Takes one 4-channel transparent PNG and produces one 3-channel image with the background color added.

python inference.py -t add_background -i ./idphoto.png -o ./idphoto_ab.jpg -c 4f83ce -k 30 -r 1
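Conceptually, adding a background means blending each transparent pixel over a solid color. The sketch below shows standard alpha compositing for a single pixel, using the hex value 4f83ce from the command above. It illustrates the idea only and is not the project's implementation. Both function names are hypothetical.

```python
from typing import Tuple

RGB = Tuple[int, int, int]
RGBA = Tuple[int, int, int, int]


def hex_to_rgb(hex_color: str) -> RGB:
    """Parse a 6-digit hex color such as '4f83ce' (a leading '#' is allowed)."""
    value = hex_color.lstrip("#")
    if len(value) != 6:
        raise ValueError(f"expected 6 hex digits, got {hex_color!r}")
    return (int(value[0:2], 16), int(value[2:4], 16), int(value[4:6], 16))


def composite_pixel(pixel: RGBA, background: RGB) -> RGB:
    """Blend one RGBA pixel over an opaque background color."""
    r, g, b, a = pixel
    alpha = a / 255  # 0.0 is fully transparent, 1.0 is fully opaque
    blended = [
        round(fg * alpha + bg * (1 - alpha))
        for fg, bg in zip((r, g, b), background)
    ]
    return (blended[0], blended[1], blended[2])


if __name__ == "__main__":
    blue = hex_to_rgb("4f83ce")
    print(blue)
    # A fully transparent pixel takes on the background color.
    print(composite_pixel((255, 255, 255, 0), blue))
```

Because the output no longer needs an alpha channel, it can be saved as a 3-channel format such as JPEG.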

4. Generate Six-Inch Layout Photo

Takes one 3-channel photo and produces one six-inch layout photo.

python inference.py -t generate_layout_photos -i ./idphoto_ab.jpg -o ./idphoto_layout.jpg --height 413 --width 295 -k 200

5. ID Photo Cropping

Takes one 4-channel photo (the image after matting) and produces one standard ID photo and one high-definition ID photo, both as 4-channel transparent PNGs.

python inference.py -t idphoto_crop -i ./idphoto_matting.png -o ./idphoto_crop.png --height 413 --width 295

⚡️ Deploy API Service

Start Backend

python deploy_api.py

Request API Service

For detailed request methods and request examples, please refer to the API Documentation.


🐳 Docker Deployment

1. Pull or Build Image

Choose one of the following methods

Method 1: Pull the latest image:

docker pull linzeyi/hivision_idphotos

Method 2: Directly build the image from Dockerfile:

After ensuring that at least one matting model weight file is placed in the hivision/creator/weights directory, execute the following in the project root directory:

docker build -t linzeyi/hivision_idphotos .

Method 3: Build using Docker Compose:

After ensuring that at least one matting model weight file is placed in the hivision/creator/weights directory, execute the following in the project root directory:

docker compose build

2. Run Services

Start Gradio Demo Service

Run the following command, and you can access it locally at http://127.0.0.1:7860.

docker run -d -p 7860:7860 linzeyi/hivision_idphotos

Start API Backend Service

docker run -d -p 8080:8080 linzeyi/hivision_idphotos python3 deploy_api.py

Start Both Services Simultaneously

docker compose up -d

Environment Variables

This project provides some additional configuration options, which can be set using environment variables:

| Environment Variable | Type | Description | Example |
| --- | --- | --- | --- |
| FACE_PLUS_API_KEY | Optional | Your API key obtained from the Face++ console | 7-fZStDJ···· |
| FACE_PLUS_API_SECRET | Optional | Secret corresponding to the Face++ API key | VTee824E···· |
| RUN_MODE | Optional | Running mode; the only option is beast (beast mode). In beast mode, the face detection and matting models do not release memory, which speeds up subsequent inference. At least 16GB of memory is recommended. | beast |

Example of using environment variables in Docker:

docker run  -d -p 7860:7860 \
    -e FACE_PLUS_API_KEY=7-fZStDJ···· \
    -e FACE_PLUS_API_SECRET=VTee824E···· \
    -e RUN_MODE=beast \
    linzeyi/hivision_idphotos 
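Inside the container, these values arrive as ordinary process environment variables. The following sketch shows one way such settings could be read in Python. The variable names come from the table above, but the `HivisionEnv` type and the `read_env` function are hypothetical and do not reflect how the project reads them internally.

```python
import os
from typing import Mapping, NamedTuple, Optional


class HivisionEnv(NamedTuple):
    """Hypothetical container for the three documented settings."""

    face_plus_api_key: Optional[str]
    face_plus_api_secret: Optional[str]
    beast_mode: bool


def read_env(env: Mapping[str, str] = os.environ) -> HivisionEnv:
    """Read the optional settings; missing variables fall back to None/False."""
    return HivisionEnv(
        face_plus_api_key=env.get("FACE_PLUS_API_KEY"),
        face_plus_api_secret=env.get("FACE_PLUS_API_SECRET"),
        beast_mode=env.get("RUN_MODE") == "beast",
    )


if __name__ == "__main__":
    settings = read_env()
    print("Face++ configured:", settings.face_plus_api_key is not None)
    print("Beast mode:", settings.beast_mode)
```

Since all three variables are optional, the sketch treats an unset variable as "feature off" rather than raising an error.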

📖 Cite Projects

  1. MTCNN:
@software{ipazc_mtcnn_2021,
    author = {ipazc},
    title = {{MTCNN}},
    url = {https://github.com/ipazc/mtcnn},
    year = {2021},
    publisher = {GitHub}
}
  2. ModNet:
@software{zhkkke_modnet_2021,
    author = {ZHKKKe},
    title = {{ModNet}},
    url = {https://github.com/ZHKKKe/MODNet},
    year = {2021},
    publisher = {GitHub}
}

Q&A

1. How to modify preset sizes and colors?

  • Size: After modifying size_list_EN.csv, run app.py again. The first column is the size name, the second column is the height, and the third column is the width.
  • Color: After modifying color_list_EN.csv, run app.py again. The first column is the color name, and the second column is the Hex value.
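To sanity-check an edited size file before restarting app.py, the column layout described above can be parsed with Python's csv module. This is a sketch under assumptions: the sample rows are made up, the presence of a header row is assumed, and `load_sizes` is a hypothetical helper rather than the project's loader.

```python
import csv
import io
from typing import Dict, Tuple


def load_sizes(csv_text: str, has_header: bool = True) -> Dict[str, Tuple[int, int]]:
    """Map each size name to (height, width), following the documented column order."""
    reader = csv.reader(io.StringIO(csv_text))
    if has_header:
        next(reader, None)  # assumption: the first row holds column titles
    sizes: Dict[str, Tuple[int, int]] = {}
    for row in reader:
        if len(row) < 3:
            continue  # skip blank or incomplete rows
        sizes[row[0].strip()] = (int(row[1]), int(row[2]))
    return sizes


# Made-up sample content; the real size_list_EN.csv may differ.
SAMPLE = """Name,Height,Width
One inch,413,295
My custom size,600,400
"""

if __name__ == "__main__":
    for name, (height, width) in load_sizes(SAMPLE).items():
        print(f"{name}: {height} x {width} px")
```

A row with a non-numeric height or width raises `ValueError`, which makes typos visible before the demo is restarted.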

2. How to Change the Watermark Font?

  1. Place the font file in the hivision/plugin/font folder.
  2. Change the font_file parameter value in hivision/plugin/watermark.py to the name of the font file.

3. How to Add Social Media Template Photos?

  1. Place the template image in the hivision/plugin/template/assets folder. The template image should be a 4-channel transparent PNG.
  2. Add the latest template information to the hivision/plugin/template/assets/template_config.json file. Here, width is the template image width (px), height is the template image height (px), anchor_points are the coordinates (px) of the four corners of the transparent area in the template; rotation is the rotation angle of the transparent area relative to the vertical direction, where >0 is counterclockwise and <0 is clockwise.
  3. Add the name of the latest template to the TEMPLATE_NAME_LIST variable in the _generate_image_template function of demo/processor.py.

4. How to Modify the Top Navigation Bar of the Gradio Demo?

  • Modify the demo/assets/title.md file.

📧 Contact Us

If you have any questions, please email zeyi.lin@swanhub.co


Contributors

Zeyi-Lin, SAKURA-CAT, Feudalman, swpfY, Kaikaikaifang, ShaohonChen, KashiwaByte


Thanks for support

Stargazers repo roster for @Zeyi-Lin/HivisionIDPhotos

Forkers repo roster for @Zeyi-Lin/HivisionIDPhotos

Star History Chart

License

This repository is licensed under the Apache-2.0 License.

Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 1. Definitions. "License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document. "Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License. "Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity. "You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License. "Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files. "Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types. "Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below). "Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. 
For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof. "Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution." "Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work. 2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form. 3. Grant of Patent License. 
Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed. 4. Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions: (a) You must give any other recipients of the Work or Derivative Works a copy of this License; and (b) You must cause any modified files to carry prominent notices stating that You changed the files; and (c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and (d) If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the 
Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License. You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License. 5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions. 6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file. 7. Disclaimer of Warranty. 
Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License. 8. Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages. 9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability. END OF TERMS AND CONDITIONS APPENDIX: How to apply the Apache License to your work. 
To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets "[]" replaced with your own identifying information. (Don't include the brackets!) The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same "printed page" as the copyright notice for easier identification within third-party archives. Copyright [yyyy] [name of copyright owner] Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
