PP-OCRv3_server_det
Introduction
PP-OCRv3_server_det is one of the PP-OCRv3_det series of text detection models developed by the PaddleOCR team. This server-side text detection model offers higher accuracy and is suited to deployment on high-performance servers.
Quick Start
Installation
- PaddlePaddle
Install PaddlePaddle with pip, using the command that matches your environment:
# for CUDA11.8
python -m pip install paddlepaddle-gpu==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/
# for CUDA12.6
python -m pip install paddlepaddle-gpu==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
# for CPU
python -m pip install paddlepaddle==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cpu/
For details about PaddlePaddle installation, please refer to the PaddlePaddle official website.
- PaddleOCR
Install the latest version of the PaddleOCR inference package from PyPI:
python -m pip install paddleocr
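After installing, it can help to confirm which of the packages named above are actually present in the active environment. The following is a minimal sketch that relies only on the Python standard library; the helper names `installed_version` and `check_environment` are illustrative and are not part of PaddlePaddle or PaddleOCR.

```python
from importlib import metadata
from typing import Dict, Optional

# Distribution names taken from the pip commands above.
PACKAGES = ("paddlepaddle", "paddlepaddle-gpu", "paddleocr")


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_environment() -> Dict[str, Optional[str]]:
    """Map each package name to its installed version (None when not installed)."""
    return {name: installed_version(name) for name in PACKAGES}


if __name__ == "__main__":
    for name, version in check_environment().items():
        print(f"{name}: {version or 'not installed'}")
```

Normally only one of `paddlepaddle` and `paddlepaddle-gpu` is expected to be installed, depending on whether the CPU or a CUDA command was used.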
Model Usage
You can try the model with a single command:
paddleocr text_detection \
--model_name PP-OCRv3_server_det \
-i https://cdn-uploads.huggingface.co/production/uploads/681c1ecd9539bdde5ae1733c/3ul2Rq4Sk5Cn-l69D695U.png
You can also integrate the text detection module into your own project through the Python API. The code below passes the sample image by URL; to use a local copy instead, download the image first and pass its path.
from paddleocr import TextDetection
model = TextDetection(model_name="PP-OCRv3_server_det")
output = model.predict(input="https://cdn-uploads.huggingface.co/production/uploads/681c1ecd9539bdde5ae1733c/3ul2Rq4Sk5Cn-l69D695U.png", batch_size=1)
for res in output:
    res.print()
    res.save_to_img(save_path="./output/")
    res.save_to_json(save_path="./output/res.json")
After running, the obtained result is as follows:
{'res': {'input_path': '/root/.paddlex/predict_input/3ul2Rq4Sk5Cn-l69D695U.png', 'page_index': None, 'dt_polys': array([[[ 635, 1432],
...,
[ 635, 1449]],
...,
[[ 356, 106],
...,
[ 354, 129]]], dtype=int16), 'dt_scores': [0.9872102548912663, 0.7382263155744568, ..., 0.7389449598798481]}}
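The result above pairs each polygon in `dt_polys` with a confidence in `dt_scores`, so a common follow-up step is to keep only the confident detections. The sketch below assumes a dictionary shaped like the inner `res` value printed above; the helper names `filter_detections` and `polygon_to_box`, the `min_score` value, and the sample coordinates are all made up for illustration.

```python
from typing import Any, Dict, List, Sequence, Tuple


def filter_detections(res: Dict[str, Any], min_score: float = 0.8) -> List[Tuple[Any, float]]:
    """Return (polygon, score) pairs whose score is at least min_score."""
    kept = []
    for poly, score in zip(res["dt_polys"], res["dt_scores"]):
        if score >= min_score:
            kept.append((poly, float(score)))
    return kept


def polygon_to_box(poly: Sequence[Sequence[int]]) -> Tuple[int, int, int, int]:
    """Convert a polygon of (x, y) points to an axis-aligned (x_min, y_min, x_max, y_max) box."""
    xs = [int(point[0]) for point in poly]
    ys = [int(point[1]) for point in poly]
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical sample data; these are NOT the values elided in the output above.
sample = {
    "dt_polys": [
        [[10, 10], [200, 10], [200, 40], [10, 40]],
        [[15, 60], [120, 62], [119, 85], [14, 83]],
    ],
    "dt_scores": [0.98, 0.74],
}

for poly, score in filter_detections(sample, min_score=0.8):
    print(polygon_to_box(poly), round(score, 2))  # prints: (10, 10, 200, 40) 0.98
```

Because the helpers only iterate and index, they should work equally on plain lists and on NumPy arrays such as the `int16` array shown in the output.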
The visualized image is as follows:
For details on the usage commands and parameter descriptions, please refer to the documentation.
Pipeline Usage
A single model has limited capability. A pipeline that combines several models can handle harder problems in real-world scenarios.
PP-OCRv3
The general OCR pipeline solves text recognition tasks by extracting text from images and outputting it in text form. The pipeline has 5 modules:
- Document Image Orientation Classification Module (Optional)
- Text Image Unwarping Module (Optional)
- Text Line Orientation Classification Module (Optional)
- Text Detection Module
- Text Recognition Module
Run a single command to quickly experience the OCR pipeline:
paddleocr ocr -i https://cdn-uploads.huggingface.co/production/uploads/681c1ecd9539bdde5ae1733c/3ul2Rq4Sk5Cn-l69D695U.png \
--text_detection_model_name PP-OCRv3_server_det \
--text_recognition_model_name PP-OCRv3_mobile_rec \
--use_doc_orientation_classify False \
--use_doc_unwarping False \
--use_textline_orientation False \
--save_path ./output \
--device gpu:0
Results are printed to the terminal:
{'res': {'input_path': '/root/.paddlex/predict_input/3ul2Rq4Sk5Cn-l69D695U.png', 'page_index': None, 'model_settings': {'use_doc_preprocessor': True, 'use_textline_orientation': False}, 'doc_preprocessor_res': {'input_path': None, 'page_index': None, 'model_settings': {'use_doc_orientation_classify': False, 'use_doc_unwarping': False}, 'angle': -1}, 'dt_polys': array([[[ 354, 106],
...,
[ 354, 129]],
...,
[[ 635, 1433],
...,
[ 635, 1449]]], dtype=int16), 'text_det_params': {'limit_side_len': 64, 'limit_type': 'min', 'thresh': 0.3, 'max_side_limit': 4000, 'box_thresh': 0.6, 'unclip_ratio': 1.5}, 'text_type': 'general', 'textline_orientation_angles': array([-1, ..., -1]), 'text_rec_score_thresh': 0.0, 'rec_texts': ['Algorithms for the Markov Entropy Decomposition', 'AndrewJ.FerrisandDavidPoulin', 'DepartementdePhysique,UniversitédeSherbrooke,Québec,JIK2Rl,Canada', '(Dated:October 31,2018)', 'The Markov entropy decomposition (MED) is a recently-proposed, cluster-based simulation method for fi-', 'nite temperature quantum systems with arbitrary geometry.In this paper, we detail numerical algorithms for', 'performing the required steps of the MED,principally solving a minimization problem with a preconditioned', '[', "Newton'salgorithm,aswellashowtoextractglobalsusceptibilitiesandthermalresponses.Wedemonstrate", 'thepowerofthemethodwiththespin-1/2XXZmodelonthe2Dsquarelattice,includingtheextractionof', 'critical points and details of eachphase. Although the method shares some qualitative similarities with exact-', 'diagonalization, we showthe MEDisboth moreaccurateand significantly moreflexible.', 'PACS numbers: 05.10.a, 02.50.Ng, 03.67.a, 74.40.Kb', 'I.INTRODUCTION', 'Thisapproximationbecomesexactinthecaseofa1Dquan-', 'tum (or classical) Markov chain [1O], and leads to an expo-', 'Although the equations governing quantum many-body', 'nentialreductionofcostforexactentropycalculationswhen', 'systems are simpletowritedown,finding solutions forthe', 'the global density matrix is a higher-dimensional Markov net-', 'majority of systems remains incrediblydifficult.Modern', 'workstate[12,13].', 'physics finds itself in need of new tools to compute the emer-', 'ThesecondapproximationusedintheMEDapproachis', 'gent behavior of large, many-body systems.', 'related to the N-representibility problem. 
Given a set of lo-', 'There has been a great variety of tools developed to tackle', 'cal but overlappingreduced density matrices (pij,it is a very', 'many-bodyproblems, but in general, large2D and 3D quan-', 'challenging problem to determine if there exists a global den-', 'tum systems remain hard to deal with.Most systems are', 'sity operator which is positive semi-definite and whose partial', 'thoughttobenon-integrable,soexactanalyticsolutionsare', 'trace agrees with each pi. This problem is QMA-hard (the', 'notusuallyexpected.Directnumericaldiagonalizationcanbe', 'quantum analogue of NP[14,15],and is hopelessly diffi-', 'performedforrelatively small systems-howeverthe emer-', 'cult to enforce. Thus, the second approximation employed', 'gent behavior of a system in the thermodynamic limit may be', 'involvesignoringglobalconsistencywithapositiveopera-', 'difficulttoextract,especiallyinsystemswithlargecorrelation', 'tor, while requiring local consistency on any overlapping re-', 'lengths.MonteCarloapproachesaretechnicallyexact(upto', 'gions between the pi.At the zero-temperature limit, the MED', 'samplingerror),butsufferfromtheso-calledsignproblem', 'approachbecomes analogousto the variational nth-orderre-', 'forfermionic,frustrated,or dynamicalproblems.Thus we are', 'duced density matrix approach, where positivity is enforced', ':', 'limited to search forcleverapproximationsto solve thema-', 'onallreduceddensitymatricesofsizen[16-18].', 'jority of many-bodyproblems.', 'TheMEDapproach is an extremelyflexible clustermethod,', 'Over the past century,hundreds of such approximations', 'applicable to both translationally invariant systems of any di-', 'havebeenproposed,andwewillmentionjustafewnotable', 'mensioninthethermodynamiclimit,aswellasfinitesystems', 'examples applicable to quantum lattice models. Mean-field', 'or systems without translational invariance (e.g. 
disordered', 'theory is simple and frequently arrives at the correct quali-', 'lattices,or harmonically trapped atoms in optical lattices).', 'tativedescription,butoftenfailswhencorrelationsareim-', 'ThefreeenergygivenbyMEDisguaranteedtolowerbound', 'portant.Density-matrix renormalisation group (DMRG) [1]', 'the true free energy, which in turn lower-bounds the ground', 'is efficient and extremely accurate at solving 1D problems,', 'stateenergy-thus providing anatural complementto varia-', 'butthecomputationalcostgrowsexponentiallywithsystem', 'tional approaches which upper-bound the ground state energy.', 'sizeintwo-orhigher-dimensions[2,3].Relatedtensor-', 'The ability to provide a rigorous ground-state energy window', 'networktechniques designedfor2D systemsare still intheir', 'is a powerful validation tool,creating a very compelling rea-', 'infancy[4-6].Series-expansionmethods [7] can be success-', 'sontousethisapproach.', 'ful, but may diverge or otherwise converge slowly, obscuring', 'In this paper we paper we present a pedagogical introduc-', 'the state in certain regimes. 
There exist a variety of cluster-', 'tiontoMED,includingnumericalimplementationissuesand', 'based techniques, such as dynamical-mean-field theory [8]', 'applications to 2D quantum lattice models in the thermody-', 'and density-matrix embedding [9].', 'namic limit.In Sec.II,we give a brief derivation of the', 'Here we discuss the so-called Markov entropy decompo-', 'Markoventropydecomposition.SectionIlIoutlinesarobust', 'sition (MED),recently proposed by Poulin &Hastings [10]', 'numericalstrategyforoptimizingtheclustersthatmakeup', '(and analogous to a slightly earlier classical algorithm [11]).', 'thedecomposition.InSec.IVweshowhowwecanextend', 'This is a self-consistent clustermethod for finite temperature', 'thesealgorithmstoextractnon-trivial information,suchas', 'systemsthattakesadvantageofanapproximationofthe(von', 'specific heat and susceptibilities.Wepresent an application of', 'Neumann)entropy.In[1o],itwas shown that the entropy', 'themethodtothespin-1/2XXZmodelona2Dsquarelattice', 'persitecanberigorouslyupperboundedusingonlylocalin-', 'inSec.V,describinghowtocharacterizethephasediagram', 'formationalocal,reduced densitymatrix on N sites, say.', 'anddeterminecriticalpoints,beforeconcludinginSec.VI.'], 'rec_scores': array([0.92588544, ..., 0.91773278]), 'rec_polys': array([[[ 354, 106],
...,
[ 354, 129]],
...,
[[ 635, 1433],
...,
[ 635, 1449]]], dtype=int16), 'rec_boxes': array([[ 354, ..., 129],
...,
[ 635, ..., 1449]], dtype=int16)}}
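In the output above, `rec_texts` appears to interleave lines from the left and right columns of the two-column page. One way to restore reading order is to split the lines by horizontal position and sort each column from top to bottom. The sketch below assumes each row of `rec_boxes` is `[x_min, y_min, x_max, y_max]` and that the column boundary `split_x` is known; the function name, the threshold, and the sample data are hypothetical.

```python
from typing import List, Sequence


def two_column_order(
    texts: Sequence[str],
    boxes: Sequence[Sequence[int]],
    scores: Sequence[float],
    split_x: int,
    min_score: float = 0.0,
) -> List[str]:
    """Return texts ordered left column first, then right column, each top to bottom.

    Lines with a recognition score below min_score are dropped.
    """
    left, right = [], []
    for text, box, score in zip(texts, boxes, scores):
        if score < min_score:
            continue
        x_min, y_min = int(box[0]), int(box[1])
        column = left if x_min < split_x else right
        column.append((y_min, text))
    # Sort by the top edge only, so ties keep their original relative order.
    left.sort(key=lambda item: item[0])
    right.sort(key=lambda item: item[0])
    return [text for _, text in left] + [text for _, text in right]


# Made-up sample: two lines per column, listed in interleaved order.
texts = ["left 1", "right 1", "left 2", "right 2"]
boxes = [[50, 100, 400, 120], [450, 100, 800, 120],
         [50, 130, 400, 150], [450, 130, 800, 150]]
scores = [0.95, 0.91, 0.40, 0.88]

print(two_column_order(texts, boxes, scores, split_x=425))
# prints: ['left 1', 'left 2', 'right 1', 'right 2']
```

A fixed `split_x` is a simplification: full-width lines such as a title or abstract would need separate handling, for example by checking whether a box spans both sides of the boundary.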
If save_path is specified, the visualization results will be saved under save_path. The visualization output is shown below:
The command line is intended for quick trials. For project integration, only a few lines of code are needed:
from paddleocr import PaddleOCR
ocr = PaddleOCR(
    text_detection_model_name="PP-OCRv3_server_det",
    text_recognition_model_name="PP-OCRv3_mobile_rec",
    use_doc_orientation_classify=False,  # Disable the document orientation classification model
    use_doc_unwarping=False,  # Disable the text image unwarping model
    use_textline_orientation=False,  # Disable the text line orientation classification model
)
result = ocr.predict("https://cdn-uploads.huggingface.co/production/uploads/681c1ecd9539bdde5ae1733c/3ul2Rq4Sk5Cn-l69D695U.png")
for res in result:
    res.print()
    res.save_to_img("output")
    res.save_to_json("output")
For details on the usage commands and parameter descriptions, please refer to the documentation.
Links