Xiaomi’s new technology: convert the table in the picture into an Excel file

Lei Jun, the founder of Xiaomi, introduced a set of table recognition algorithms developed by Xiaomi that efficiently and accurately convert tables in pictures into editable Excel files, significantly improving the user experience. Table recognition means converting the table structure and text in a picture into a data format that computers can process. It has wide practical value in office, business, and education scenarios and has long been an active topic in document analysis research.

To address this problem, Xiaomi has developed a set of table recognition algorithms that efficiently and accurately extract the tables in pictures and convert them into editable Excel files. The algorithms have shipped on the Xiaomi 10S series, MIX Fold 2, and other flagship models. You can try the feature from Album – More – Table Recognition, or swipe to enter.

Table Detection Algorithm

Xiaomi said that the table detection algorithm mainly extracts the table area accurately from the picture and corrects the table to get a flat table picture for the next step of table recognition.

The table recognition algorithm mainly extracts the table structure and the table text content from the picture, then combines this information to output an editable Excel table.

Table detection has two main difficulties. On the one hand, computing power and memory on the phone are limited. On the other hand, the requirements for the detection results are very high: a table is often surrounded by other text, and an inaccurate detection result has a negative impact on the recognition step that follows.

Xiaomi’s table detection algorithm detects both the table area and the four corner points of the table, then applies a perspective transformation and a self-developed anti-distortion algorithm to produce a flat image containing only the table area. The effect is shown in the figure.
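As an illustration of the perspective transformation step only, the sketch below computes the 3x3 homography that maps four detected corner points onto an upright rectangle. It is a generic textbook construction, not Xiaomi's implementation, and it does not cover the anti-distortion algorithm. The function names and the sample coordinates are assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides together.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("corner points are degenerate")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src: Sequence[Point], dst: Sequence[Point]) -> List[List[float]]:
    """Return the 3x3 matrix H that maps each src corner to its dst corner."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), and likewise for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def warp_point(h: List[List[float]], p: Point) -> Point:
    """Apply H to one point, including the projective division."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical detected corners (top-left, top-right, bottom-right, bottom-left)
# and the flat 200 x 100 rectangle they should be mapped onto.
corners = [(12.0, 8.0), (95.0, 20.0), (100.0, 70.0), (5.0, 60.0)]
flat = [(0.0, 0.0), (200.0, 0.0), (200.0, 100.0), (0.0, 100.0)]
H = homography(corners, flat)
```

To flatten a whole image, each output pixel would be mapped back through the inverse of `H` and sampled from the photo; image libraries typically provide this as a single warp call.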

Since the algorithm runs on the phone, both running speed and model size matter. Xiaomi therefore adopts a very lightweight one-stage detection framework with ShuffleNetV2 as the backbone. The model regresses the key point information while detecting the table box, which facilitates perspective correction of the table, and it uses Wing loss instead of L1 loss to make the key point regression more accurate.
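For reference, here is a minimal sketch of Wing loss as it is defined in the landmark-localization literature, next to plain L1 loss. The default values `w=10.0` and `eps=2.0` are common choices from that literature and are assumptions here; the article does not state which settings Xiaomi uses.

```python
import math
from typing import Sequence


def wing_loss(pred: Sequence[float], target: Sequence[float],
              w: float = 10.0, eps: float = 2.0) -> float:
    """Mean Wing loss over a flat list of key point coordinates."""
    # The constant c joins the logarithmic and linear parts continuously at |x| = w.
    c = w - w * math.log(1.0 + w / eps)
    total = 0.0
    for p, t in zip(pred, target):
        x = abs(p - t)
        if x < w:
            total += w * math.log(1.0 + x / eps)  # log region for small errors
        else:
            total += x - c                        # L1-like region for large errors
    return total / len(pred)


def l1_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error, for comparison."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)
```

In the logarithmic region the gradient magnitude is `w / (eps + |x|)`, which for small errors is larger than the constant gradient of L1 loss, so small localization errors keep contributing to training.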

In terms of data, algorithms are used to mine a large amount of table detection data from public data at low cost, which significantly improves table detection. The final model is about 1 MB in size and runs smoothly on Xiaomi phones.

Table Recognition Algorithm

The table recognition algorithm runs on the server side and contains the following main modules: text detection, text recognition, table structure prediction, cell matching, alignment algorithm, and Excel export.
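The article does not describe how the cell matching module works. One simple way to think about it, shown here as an assumption rather than as Xiaomi's method, is to assign each recognized text line to the cell that covers the largest share of its bounding box. The `TextLine` type, the `(row, column)` cell keys, and the function names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
CellKey = Tuple[int, int]                # (row, column) of a cell


@dataclass
class TextLine:
    box: Box
    text: str


def overlap_ratio(text_box: Box, cell_box: Box) -> float:
    """Fraction of the text box's area that lies inside the cell."""
    width = min(text_box[2], cell_box[2]) - max(text_box[0], cell_box[0])
    height = min(text_box[3], cell_box[3]) - max(text_box[1], cell_box[1])
    area = (text_box[2] - text_box[0]) * (text_box[3] - text_box[1])
    if width <= 0 or height <= 0 or area <= 0:
        return 0.0
    return (width * height) / area


def match_text_to_cells(lines: List[TextLine],
                        cells: Dict[CellKey, Box]) -> Dict[CellKey, str]:
    """Attach every text line to its best-overlapping cell, in reading order."""
    matched: Dict[CellKey, List[TextLine]] = {key: [] for key in cells}
    for line in lines:
        best = max(cells, key=lambda key: overlap_ratio(line.box, cells[key]))
        if overlap_ratio(line.box, cells[best]) > 0.0:
            matched[best].append(line)
    # Inside a cell, order lines top to bottom, then left to right.
    return {
        key: " ".join(l.text for l in sorted(found, key=lambda l: (l.box[1], l.box[0])))
        for key, found in matched.items()
    }


cells = {(0, 0): (0.0, 0.0, 100.0, 30.0), (0, 1): (100.0, 0.0, 200.0, 30.0)}
lines = [TextLine((110.0, 5.0, 150.0, 25.0), "Price"),
         TextLine((5.0, 5.0, 60.0, 25.0), "Name")]
result = match_text_to_cells(lines, cells)
```

The resulting mapping from cell position to text is the kind of structure an Excel export step could then write out cell by cell.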

The current mainstream approach is to represent a table as HTML, then train a model to predict the HTML tag sequence and the corresponding coordinate information.

This method has achieved good results on open source datasets, and Ping An Technology of China and Baidu have also adopted this scheme, but the large number of HTML tags makes table structure recognition error-prone.

To address the shortcomings of this method, Xiaomi adopts a new encoding method for tables that can represent tables of arbitrary structure with only four tags, greatly improving the accuracy of table structure recognition.
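The article does not say which four tags Xiaomi uses. To show how a small tag vocabulary can still describe merged cells, the sketch below uses a purely hypothetical scheme with one tag per grid position: `C` starts a new cell, `L` continues the cell to its left, `U` continues the cell above, and `X` marks the interior of a cell that spans several rows and columns. The tag names and the decoder are assumptions for illustration, not Xiaomi's encoding.

```python
from typing import List, Tuple

# (row, column, row_span, column_span) of one decoded cell
Cell = Tuple[int, int, int, int]


def decode_grid(grid: List[List[str]]) -> List[Cell]:
    """Turn a grid of hypothetical C/L/U/X tags into cells with their spans."""
    cells: List[Cell] = []
    for r, row in enumerate(grid):
        for c, tag in enumerate(row):
            if tag != "C":
                continue  # L, U and X positions belong to an earlier cell
            col_span = 1
            while c + col_span < len(row) and row[c + col_span] == "L":
                col_span += 1
            row_span = 1
            while r + row_span < len(grid) and grid[r + row_span][c] == "U":
                row_span += 1
            cells.append((r, c, row_span, col_span))
    return cells


# A 3 x 3 grid: the first cell spans two columns,
# and the first cell of the second row spans two rows.
grid = [
    ["C", "L", "C"],
    ["C", "C", "C"],
    ["U", "C", "C"],
]
decoded = decode_grid(grid)
```

Because every grid position carries exactly one tag from a tiny vocabulary, a model predicting such a sequence has far fewer ways to produce an invalid structure than one predicting nested HTML tags with attributes.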

During deployment, table recognition is accelerated with the FasterTransformer inference framework. Xiaomi says this improves inference speed by about 20 times, significantly improving the user experience.

Summary

The algorithm can efficiently and easily extract tables from images, greatly improving office efficiency. Xiaomi said that engineers will continue to improve the recognition experience of document-based images in Xiaomi phones.

Threza Gabriel
https://www.techgoing.com
Threza Gabriel is a news writer at TechGoing. TechGoing is a global tech media outlet that brings you the latest technology stories, including smartphones, electric vehicles, smart home devices, gaming, wearable gadgets, and all trending tech.
