(19)
(11)EP 2 549 406 B1

(12)EUROPEAN PATENT SPECIFICATION

(45)Mention of the grant of the patent:
10.06.2020 Bulletin 2020/24

(21)Application number: 12172723.4

(22)Date of filing:  20.06.2012
(51)International Patent Classification (IPC): 
G06K 9/32(2006.01)
G06F 17/27(2006.01)
G06K 9/34(2006.01)

(54)

Method of and device for identifying direction of characters in image block

Verfahren und Vorrichtung zur Identifizierung der Richtung von Zeichen in einem Bildblock

Procédé et dispositif pour identifier la direction des caractères dans un bloc d'image


(84)Designated Contracting States:
AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

(30)Priority: 20.07.2011 CN 201110209832

(43)Date of publication of application:
23.01.2013 Bulletin 2013/04

(73)Proprietor: FUJITSU LIMITED
Kawasaki-shi, Kanagawa 211-8588 (JP)

(72)Inventors:
  • Sun, Jun
    100025 Chaoyang District, Beijing (CN)
  • Naoi, Satoshi
    100025 Chaoyang District, Beijing (CN)

(74)Representative: Haseltine Lake Kempner LLP 
Lincoln House, 5th Floor 300 High Holborn
London WC1V 7JH (GB)


(56)References cited:
EP-A2- 1 073 001
US-B1- 6 804 414
EP-A2- 1 703 444
  
      
    Note: Within nine months from the publication of the mention of the grant of the European patent, any person may give notice to the European Patent Office of opposition to the European patent granted. Notice of opposition shall be filed in a written reasoned statement. It shall not be deemed to have been filed until the opposition fee has been paid. (Art. 99(1) European Patent Convention).


    Description

    Field of the Invention



    [0001] The present invention generally relates to processing of a document image and in particular to a method of and device for identifying the direction of characters in an image block.

    Background of the Invention



    [0002] When a user scans a stack of documents using a scanner or the like, the ideal input has every page of each document placed upside up. With the documents placed upside up, the user can read them easily, and the scanned images can be read without adjusting their directions. In practice, however, the documents to be scanned tend to be placed variously at angles of 0° (upside up) and 180° (upside down) as well as 90° and 270° (transversely). It would be burdensome and time-consuming for the user to check and adjust the placement direction of each document page by page before scanning. Scanners are therefore designed with a function of automatically judging the direction of a document image, so that the scanned document image can be adjusted to be upside up, which relieves the burden on the user and improves efficiency.

    [0003] In a traditional method of judging automatically the direction of a document image, a text line in the document image is located; optical character recognition processing is performed respectively in four possible directions to obtain recognized characters and corresponding confidences or recognition distances thereof in the four possible directions; and the average confidence or the average recognition distance of the text line is calculated. The direction with the largest average confidence or the smallest average recognition distance is judged as the direction of the text line, and the direction of the document image is further judged from the direction of the text line. The direction of the text line refers to the upside-up direction of the text line, and the direction of the document image refers to the upside-up direction of the document image. Hereinafter, the character direction (or the direction of characters) refers to the upside-up direction of (the) characters.

    [0004] EP1 703 444 A2 discloses a method and a device in accordance with the preamble of each independent claim. The method and device select a language to be considered first, and character recognition is performed for each possible orientation of an image block with the selected language, using training data for the selected language. After all orientations have been considered, a next language is considered and the process is repeated. After all languages have been considered in all orientations, the orientation and language producing the highest recognition score is taken as the final result.

    Summary of the Invention



    [0005] As illustrated in Fig.1, an image block of a text line "TIP AMOUNT" is input in the assumed 0° direction and is rotated by 180° to obtain an image block of the text line in the 180° direction. Since processes in the directions of 90° and 270° are similar to those in the directions of 0° and 180°, only the processes in the directions of 0° and 180° will be described here as an example. Optical character recognition processing is performed respectively on the image blocks of the text line in the directions of 0° and 180° to obtain sub image blocks, recognized characters corresponding to the sub image blocks and confidences thereof in the two directions as illustrated in Fig.1.

    [0006] In the traditional method, the average confidence of the recognized characters in the 0° direction equals (0.54+0.36+0.48+0.61+0.62+0.61+0.58+0.65)/8=0.55625, and the average confidence of the recognized characters in the 180° direction equals (0.62+0.58+0.65+0.62+0.46+0.50+0.61)/7=0.5771. Since 0.55625 is smaller than 0.5771, the traditional method might judge in error that the 180° direction (i.e., the direction with the higher average confidence) is the direction of the characters in the image block of the text line.
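    The arithmetic of the traditional method can be reproduced with a minimal sketch. The confidences are those quoted above for Fig.1; the function name `traditional_direction` and the dictionary layout are assumptions made only for illustration.

```python
from statistics import mean

# Confidences quoted in paragraph [0006] for the Fig.1 example.
conf_0 = [0.54, 0.36, 0.48, 0.61, 0.62, 0.61, 0.58, 0.65]
conf_180 = [0.62, 0.58, 0.65, 0.62, 0.46, 0.50, 0.61]


def traditional_direction(confidences_by_direction):
    """Pick the direction with the largest average confidence (hypothetical helper)."""
    averages = {d: mean(c) for d, c in confidences_by_direction.items()}
    return max(averages, key=averages.get), averages


direction, averages = traditional_direction({0: conf_0, 180: conf_180})
print(direction, averages)
```

    With these inputs the sketch selects the 180° direction, which is the erroneous judgment described above.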

    [0007] One reason the foregoing error arises is that recognized characters generated by an optical character recognition engine supporting multiple languages may sometimes belong to different languages. This is easy to understand: a character belonging to a specific language is easily recognized as a character in the correct language when it is placed upside-up, but after it is rotated by 90°, 180° or 270° it may not always be recognized as a character in the same language and may instead appear similar to a character in another language.

    [0008] The consistency in language has been disregarded in the prior art. The inventors have realized that, when a character is rotated through a number of angles and recognized in each of the resulting directions, the probability that the optical character recognition engine misjudges it as a character in another language is lowest when the character is placed upside-up, so the recognized characters in that direction are more consistent in language. Thus, when optical character recognition processing is performed on the image block including characters by taking each of various directions in turn as the assumed character direction, to obtain sub image blocks, recognized characters corresponding to the sub image blocks and correctness measures thereof in each of the assumed character directions, a language to which the characters in the image block belong can be determined in at least one of the directions, and a recognized character in a language other than the determined language is more likely to correspond to a character which is not placed upside-up.

    [0009] The consistency in language has been disregarded in the traditional method for judging the direction of a text line from an average recognition confidence, thus the performance thereof is degraded to some extent.

    [0010] It is desirable to address the foregoing problem by proposing a method of and device for correctly identifying the direction of characters in an image block. This solution takes the consistency in language into account and accordingly adjusts the correctness measure (confidence or recognition distance) corresponding to a sub image block, thereby improving the accuracy of automatically judging the direction of a document image.

    [0011] In order to attain the foregoing desirable feature, there is provided according to aspects of the invention a method of identifying the direction of characters in an image block, a device for identifying the direction of characters in an image block, and a computer program product, in accordance with each independent claim.

    Brief Description of the Drawings



    [0012] The foregoing and other desirable features and advantages of the invention will become more apparent from the detailed description of the embodiments thereof given below in conjunction with the accompanying drawings in which those components are merely intended to illustrate the principle of the invention and throughout which identical or like technical features or components are denoted with identical or like reference numerals. In the drawings:

    Fig.1 illustrates sub image blocks, recognized characters and confidences thereof as well as candidate characters and confidences thereof in the directions of 0° and 180° resulting from optical character recognition processing of an image block of a text line;

    Fig.2 illustrates sub image blocks, recognized characters and recognition distances thereof as well as candidate characters and recognition distances thereof in the directions of 0° and 180° resulting from optical character recognition processing of an image block of a text line;

    Fig.3 illustrates a flow chart of a method of identifying the direction of characters in an image block according to an embodiment of the invention;

    Fig.4 illustrates a structural block diagram of an identifying device for identifying the direction of characters in an image block according to an embodiment of the invention; and

    Fig.5 illustrates a schematic block diagram of a computer that can be used for implementing the method and device according to the embodiments of the invention.


    Detailed Description of the Invention



    [0013] Exemplary embodiments of the present invention will be detailed below in conjunction with the accompanying drawings. For the sake of clarity and conciseness, not all the features of practical implementations will be described in the specification. However it shall be appreciated that, in developing any such practical implementation, numerous implementation-specific decisions shall be made to achieve the developer's specific goals, for example, to comply with system- and business-related constraints which may vary from one implementation to another. Moreover it shall also be appreciated that such a development effort might be very complex and time-consuming but would nevertheless be a routine task for those skilled in the art benefiting from this disclosure.

    [0014] It shall further be noted that only those device structures and/or process steps closely relevant to the solutions of the invention are illustrated in the drawings, while other details less relevant to the invention are omitted so as not to obscure the invention with unnecessary detail. It shall also be noted that an element and a feature described in one of the drawings or implementations of the invention can be combined with an element and a feature illustrated in one or more other drawings or implementations.

    [0015] As assumed in this context, a text line has been located in a document image, and an image block containing the text line has been cropped from the document image. The embodiments are focused not upon how to locate the text line in the document image but upon how to identify correctly the direction of characters in the image block containing the text line.

    [0016] Typically, four primary directions are considered as assumed character directions, i.e., the direction of the image block itself (the 0° direction), the direction of the image block rotated by 180°, the direction of the image block rotated by 90° and the direction of the image block rotated by 270°, also referred to as two transverse directions and two longitudinal directions of the image block. The directions of 90° and 270° generally apply to characters that may be written vertically, e.g., Chinese, Japanese, etc. Since the scenario with the directions of 0° and 180° is similar to the scenario with the directions of 90° and 270°, only the scenario with the directions of 0° and 180° will be described below by way of example.
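    As a minimal sketch of how the four assumed character directions could be produced, the following rotates an image block represented as a nested list of pixel values. The representation, the function name `rotate_block` and the choice of clockwise rotation are assumptions; the specification does not state a rotation sense or an image format.

```python
def rotate_block(block, angle):
    """Rotate a 2-D grid (list of rows) clockwise by 0, 90, 180 or 270 degrees."""
    if angle % 90 != 0:
        raise ValueError("angle must be a multiple of 90 degrees")
    rotated = [list(row) for row in block]
    for _ in range((angle // 90) % 4):
        # Reversing the row order and transposing is one clockwise quarter turn.
        rotated = [list(row) for row in zip(*rotated[::-1])]
    return rotated


block = [[1, 2],
         [3, 4]]
# One rotated copy of the block per assumed character direction.
candidates = {angle: rotate_block(block, angle) for angle in (0, 90, 180, 270)}
```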

    [0017] A flow of a method of identifying the direction of characters in an image block according to an embodiment of the invention will be described below with reference to Fig.3.

    [0018] Firstly, optical character recognition processing is performed on the image block by assuming 0° and 180° as assumed character directions to obtain sub image blocks, recognized characters corresponding to the sub image blocks and correctness measures thereof in the directions of 0° and 180° (at S301). Fig.1 illustrates an example of numbered sub image blocks, recognized characters and confidences in the directions of 0° and 180°. Fig.2 illustrates an example of numbered sub image blocks, recognized characters and recognition distances in the directions of 0° and 180°. A recognition result of optical character recognition generally includes the sub image blocks into which the image block is divided, the recognized characters corresponding to the sub image blocks and the correctness measures of the recognized characters. A correctness measure reflects the reliability of a recognized character and is typically a confidence or a recognition distance. The higher the confidence, the more likely the recognized character is correct; the shorter the recognition distance, the more likely the recognized character is correct.

    [0019] Next a language group which the characters in the image block belong to is determined (at S302). As described above, one reason errors arise is that the consistency in language has been disregarded. In real life, written text typically includes a large number of characters in a primary language and also a small number of intermixed characters in another language, Arabic numerals, non-character symbols, etc. For example, text written in Chinese may be used in combination with symbols other than Chinese characters, such as "good luck", "5782", "%", etc. Therefore the invention will not be limited to any specific language but can be used with the concept of a language group.

    [0020] The language group includes a Latin language group, a Korean language group and a Chinese/Japanese language group.

    [0021] The Latin language group includes Latin letters, Arabic numerals, non-character symbols, etc. The non-character symbols include, for example, "%", "(", ")", "&", etc.

    [0022] The Korean language group includes Korean letters, Latin letters, Arabic numerals, non-character symbols, etc. Since Latin letters frequently occur in medical nouns, plant varieties, etc., and also are frequently used with another language, the non-Latin language groups include Latin letters, Arabic numerals, non-character symbols, etc.

    [0023] However the other language groups, such as the Korean language group, include all the contents of the Latin language group, so the other language groups shall be further characterized to be distinguished from the Latin language group in order to prevent the Latin language group from being judged in error as another language group. Therefore a core character set is further set for the other language groups. A core character set of a language group includes a set of characters distinguishing the language group from the Latin language group. For example, a core character set of the Korean language group includes Korean letters.

    [0024] Similarly the Chinese/Japanese language group includes simplified Chinese characters, traditional Chinese characters, Chinese characters in Japanese, Japanese hiragana, Japanese katakana, Latin letters, Arabic numerals and non-character symbols. A core character set of the Chinese/Japanese language group includes simplified Chinese characters, traditional Chinese characters, Chinese characters in Japanese, Japanese hiragana and Japanese katakana. The reason for this arrangement is that text written in Japanese frequently contains Chinese characters as a result of historical cultural exchange.

    [0025] The Latin language group, the Korean language group and the Chinese/Japanese language group listed above are merely a few examples for a language group. Other designs for a language group can occur to those skilled in the art in light of the listed examples. For example, a Russian language group may include Russian letters, Latin letters, Arabic numerals, non-character symbols, etc.
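    The relationship between a language group and its core character set can be sketched as follows. The Unicode code-point ranges, the use of Unicode punctuation and symbol categories to stand for "non-character symbols", and all function names are assumptions chosen for illustration; the specification does not define the groups in terms of Unicode.

```python
import unicodedata

# Assumed code-point ranges for the core character sets (illustrative, not exhaustive).
CORE_RANGES = {
    "korean": [(0x1100, 0x11FF), (0x3130, 0x318F), (0xAC00, 0xD7A3)],
    "chinese_japanese": [(0x3040, 0x309F), (0x30A0, 0x30FF), (0x4E00, 0x9FFF)],
}


def in_latin_group(ch: str) -> bool:
    """Latin letters, Arabic numerals and non-character symbols."""
    if ch in "0123456789":
        return True
    if unicodedata.name(ch, "").startswith("LATIN"):
        return True
    # Punctuation (P*) and symbol (S*) categories stand in for non-character symbols.
    return unicodedata.category(ch)[0] in "PS"


def in_core_set(ch: str, group: str) -> bool:
    """True if ch is in the core character set of a non-Latin group."""
    return any(lo <= ord(ch) <= hi for lo, hi in CORE_RANGES.get(group, []))


def in_group(ch: str, group: str) -> bool:
    """Every group contains the Latin group; non-Latin groups add their core set."""
    return in_latin_group(ch) or in_core_set(ch, group)
```

    Under these assumptions a Latin letter belongs to every group but to no core character set, which is what allows the core character set to distinguish a non-Latin group from the Latin language group.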

    [0026] The language group to which the characters in the image block belong can be determined in numerous ways. An illustrative method based upon a result of optical character recognition processing will be presented here, but the invention will not be limited thereto.

    [0027] In each of the assumed character directions, the proportion of sub image blocks corresponding to recognized characters which belong to each of the language groups in all of the sub image blocks in the assumed character direction is calculated respectively. Under a determination condition for the Latin language group, when the proportion of sub image blocks corresponding to recognized characters which belong to the Latin language group in one of the assumed character directions in all of the sub image blocks in the assumed character direction is larger than a first threshold, the Latin language group is determined as the language group which the characters in the image block belong to. The first threshold for a language group can be determined from a statistical analysis of a corpus consisting of linguistic materials that belong to the language group. The first threshold is relatively high because characters belonging to a language shall predominate among characters written in that language. The statistical analysis shows that the value range of the first threshold for the Latin language group is preferably (0.65, 0.85) and the more preferred value is 0.7. Under a determination condition for the Korean language group, when the proportion of sub image blocks corresponding to recognized characters which belong to the Korean language group in one of the assumed character directions in all of the sub image blocks in the assumed character direction is larger than a first threshold and the proportion of sub image blocks corresponding to recognized characters which belong to a core character set of the Korean language group in all of the sub image blocks in the assumed character direction is larger than a second threshold, the Korean language group is determined as the language group which the characters in the image block belong to. The second threshold for a language group can likewise be determined from the statistical analysis of a corpus consisting of linguistic materials that belong to the language group. The first threshold is evidently higher than the second threshold. The statistical analysis shows that the value range of the first threshold for the Korean language group is preferably (0.65, 0.85) and the more preferred value is 0.7, and the value range of the second threshold is preferably (0.45, 0.65) and the more preferred value is 0.5. A determination method for the Chinese/Japanese language group is similar to that for the Korean language group. Different first and second thresholds can be set for different language groups according to the result of a statistical analysis and based upon empirical and linguistic knowledge.
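    The two determination conditions can be sketched as follows, using the preferred threshold values 0.7 and 0.5. The membership tests are deliberately simplified stand-ins (printable ASCII for the Latin language group, Hangul syllables for the Korean core character set), and the names `GROUPS` and `determine_language_group` are hypothetical.

```python
def is_latin(ch):
    # Simplified stand-in for the Latin language group: printable, non-space ASCII.
    return ch.isascii() and ch.isprintable() and not ch.isspace()


def is_hangul(ch):
    # Simplified stand-in for the Korean core character set: Hangul syllables.
    return "\uac00" <= ch <= "\ud7a3"


# (name, group membership test, core-set test or None, first threshold, second threshold)
GROUPS = [
    ("latin", is_latin, None, 0.7, None),
    ("korean", lambda ch: is_latin(ch) or is_hangul(ch), is_hangul, 0.7, 0.5),
]


def proportion(chars, predicate):
    """Share of sub image blocks whose recognized character satisfies predicate."""
    return sum(1 for ch in chars if predicate(ch)) / len(chars)


def determine_language_group(recognized_by_direction, groups=GROUPS):
    """Return the first group whose condition holds in any assumed direction."""
    for chars in recognized_by_direction.values():
        if not chars:
            continue
        for name, member, core, first_threshold, second_threshold in groups:
            if proportion(chars, member) <= first_threshold:
                continue  # first threshold not exceeded
            if core is not None and proportion(chars, core) <= second_threshold:
                continue  # core character set too rare: not this group
            return name
    return None
```

    With the preferred values the two conditions cannot both hold in one direction, since a Latin proportion above 0.7 leaves at most 0.3 for the core character set, so the order in which the groups are tested does not matter in this sketch.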

    [0028] As described, it is determined in each of the assumed character directions whether the language group which the characters in the image block belong to is the Latin language group, the Korean language group or the Chinese/Japanese language group, etc. The language group which the characters in the image block belong to can be determined as long as one of the foregoing conditions is satisfied in one of the assumed character directions. All the elements of the Latin language group are introduced into the non-Latin language groups in order to comply with the real composition of characters. Also the non-Latin language groups can be distinguished effectively from the Latin language group due to the introduced concept of a core character set to thereby avoid such a situation that the determined language group which the characters in the image block belong to might vary from one assumed character direction to another.

    [0029] For example, the recognized characters in the 0° direction are U, P, A, H, O, V, N and ┌ in the example illustrated in Fig.1. The first threshold is set to 0.7 and the second threshold is set to 0.5. In particular, the number of recognized characters belonging to the Latin language group is 8, and each of the recognized characters corresponds to one sub image block; the number of sub image blocks in that direction is 8, and 100% is above the first threshold 0.7, so the language group which the characters in the image block belong to is determined as the Latin language group. Similarly the language group which the characters in the image block belong to can also be determined in the 180° direction as the Latin language group. The Korean language group and the Chinese/Japanese language group are precluded because they fail to satisfy the condition for the second threshold although they satisfy the condition for the first threshold.

    [0030] Next a correctness measure corresponding to a sub image block which corresponds to a recognized character not belonging to the determined language group in each of the assumed character directions is adjusted at S303.

    [0031] Particularly the confidence corresponding to the sub image block is decreased or the recognition distance corresponding to the sub image block is increased. This is because a recognized character in another language group than the correctly determined language group is more likely to be an incorrect recognition result arising from rotation of a character placed upside-up. Thus the confidence can be decreased or the recognition distance can be increased to thereby determine the direction of the characters in the image block more accurately based upon the correctness measures. Therefore it can be appreciated that the confidence can be decreased or the recognition distance can be increased as long as this is done reasonably.

    [0032] A specific implementation will be presented here. When the image block is divided and recognized with an optical character recognition engine, a plurality of sub image blocks are usually obtained, and a plurality of candidate characters with different correctness measures can be given for each of the sub image blocks and arranged in a descending order of confidences or an ascending order of recognition distances as illustrated at the bottoms of Fig.1 and Fig.2. Typically a candidate character with the highest confidence or the shortest recognition distance is presented as a recognized character.

    [0033] However as described above, a recognized character in a specific assumed character direction, e.g., the recognized characters corresponding to the sub image blocks N3 and N5 in Fig.1 and Fig.2, etc., may not belong to the language group which the characters in the image block belong to, but a part of the candidate characters in the assumed character direction for N3 and N5 may belong to the language group which the characters in the image block belong to. Thus the adjusted confidence or recognition distance corresponding to the sub image block can be set to the maximum confidence among confidences corresponding to all of the candidate characters that belong to the language group which the characters in the image block belong to in the assumed character direction and correspond to the sub image block or a minimum recognition distance among recognition distances corresponding to all of the candidate characters that belong to the language group which the characters in the image block belong to in the assumed character direction and correspond to the sub image block. Apparently the adjusted confidence has been decreased or the adjusted recognition distance has been increased. In the case that none of the candidate characters in the assumed character direction, which correspond to the sub image block, belongs to the language group to which the characters in the image block belong, the adjusted confidence or recognition distance corresponding to the sub image block can be set to the minimum confidence among confidences corresponding to all of the candidate characters in the assumed character direction which correspond to the sub image block or the maximum recognition distance among recognition distances corresponding to all of the candidate characters in the assumed character direction which correspond to the sub image block.
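    The adjustment rule described above can be sketched as a single function covering both kinds of correctness measure. The function name `adjust_measure`, the `(character, measure)` tuple layout and the candidate characters other than "n" are assumptions; only the values 0.65, 0.49 and 920 for N3 are taken from the description.

```python
def adjust_measure(candidates, in_group, higher_is_better=True):
    """Return the adjusted correctness measure for one sub image block.

    candidates: list of (character, measure) pairs from the OCR engine.
    in_group:   predicate telling whether a character is in the determined group.
    higher_is_better: True for confidences, False for recognition distances.
    """
    best = max if higher_is_better else min
    worst = min if higher_is_better else max
    recognized_char, recognized_measure = best(candidates, key=lambda c: c[1])
    if in_group(recognized_char):
        return recognized_measure  # recognized character is consistent: no change
    in_group_measures = [m for ch, m in candidates if in_group(ch)]
    if in_group_measures:
        return best(in_group_measures)  # best candidate within the determined group
    return worst(m for _, m in candidates)  # no candidate in the group at all


def is_latin(ch):
    # Simplified stand-in for membership of the Latin language group.
    return ch.isascii()


# N3 in the 180° direction; "и" and "п" are hypothetical candidate characters.
n3_confidences = [("и", 0.65), ("n", 0.49), ("п", 0.40)]
n3_distances = [("и", 700), ("n", 920)]  # 700 is a hypothetical original distance
print(adjust_measure(n3_confidences, is_latin))                         # 0.49
print(adjust_measure(n3_distances, is_latin, higher_is_better=False))   # 920
```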

    [0034] For example, the recognized characters of N3 and N5 do not belong to the Latin language group determined at S302 in the examples illustrated in Fig.1 and Fig.2. Thus the candidate characters corresponding to N3 are searched for the candidate character belonging to the Latin language group and having the highest confidence or the shortest recognition distance. Fig.1 illustrates the candidate character "n" belonging to the Latin language group and having the highest confidence, 0.49, among the candidate characters corresponding to N3 in that direction. Therefore 0.65 is replaced with 0.49 as the confidence corresponding to N3. Similarly the confidence corresponding to N5 is adjusted to 0.39. Likewise the recognition distances corresponding to N3 and N5 are adjusted respectively to 920 and 1230 in Fig.2.

    [0035] Next an accumulative correctness measure in each of the assumed character directions is calculated based on the adjusted correctness measure at S304, and the direction of the characters in the image block is identified according to the accumulative correctness measures at S305.

    [0036] An accumulative correctness measure includes an accumulative confidence and an accumulative recognition distance. An accumulative confidence is a correctness measure characterizing an identification result of the image block of the text line in a direction as a whole. Typically the sum of confidences corresponding to all the sub image blocks in an assumed character direction can be taken as an accumulative confidence in that direction. Alternatively the arithmetic average of confidences corresponding to all the sub image blocks in an assumed character direction can be taken as an accumulative confidence in that direction. A direction with a higher accumulative confidence is more likely to be a correct identification result. The average confidence is more reliable because the number of sub image blocks may vary from one direction to another.

    [0037] An accumulative recognition distance is also a correctness measure characterizing an identification result of the image block of the text line in a direction as a whole. Likewise the sum of recognition distances corresponding to all the sub image blocks in an assumed character direction can be taken as an accumulative recognition distance in that direction. Alternatively the arithmetic average of recognition distances corresponding to all the sub image blocks in an assumed character direction can be taken as an accumulative recognition distance in that direction. A direction with a shorter accumulative recognition distance is more likely to be a correct identification result. The average recognition distance is more reliable because the number of sub image blocks may vary from one direction to another.

    [0038] In the example of Fig.1, after S301 to S303 are performed,

    [0039] Accumulative confidence in the 0° direction = (0.54+0.36+0.48+0.61+0.62+0.61+0.58+0.65)/8=0.55625

    [0040] Accumulative confidence in the 180° direction = (0.62+0.58+0.49+0.62+0.39+0.50+0.61)/7=0.5442

    [0041] In the example of Fig.2, after S301 to S303 are performed,

    [0042] Accumulative recognition distance in the 0° direction = (928+1279+1034+774+578+779+840+695)/8=863.375

    [0043] Accumulative recognition distance in the 180° direction = (759+840+920+769+1230+1005+790)/7=901.857

    [0044] Apparently the accumulative confidence in the 0° direction is higher than the accumulative confidence in the 180° direction or the accumulative recognition distance in the 0° direction is shorter than the accumulative recognition distance in the 180° direction after the confidences or the recognition distances are adjusted as above, thereby giving a more accurate determination result.
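    The calculations of paragraphs [0039] to [0043] and the resulting decision can be checked with a short sketch. The adjusted values are those given in the description; the function name `identify_direction` and its parameters are assumptions.

```python
from statistics import mean

# Adjusted correctness measures from paragraphs [0039] to [0043].
confidences = {
    0: [0.54, 0.36, 0.48, 0.61, 0.62, 0.61, 0.58, 0.65],
    180: [0.62, 0.58, 0.49, 0.62, 0.39, 0.50, 0.61],
}
distances = {
    0: [928, 1279, 1034, 774, 578, 779, 840, 695],
    180: [759, 840, 920, 769, 1230, 1005, 790],
}


def identify_direction(measures_by_direction, higher_is_better=True, accumulate=mean):
    """Accumulate per direction, then pick the largest (confidence) or smallest (distance)."""
    accumulated = {d: accumulate(m) for d, m in measures_by_direction.items()}
    pick = max if higher_is_better else min
    return pick(accumulated, key=accumulated.get), accumulated


print(identify_direction(confidences))                          # selects 0
print(identify_direction(distances, higher_is_better=False))    # selects 0
```

    Passing `accumulate=sum` for the recognition distances would select the 180° direction instead (6313 against 6907), simply because that direction has seven sub image blocks rather than eight, which illustrates why the average is described as more reliable.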

    [0045] A structure of an identifying device for identifying the direction of characters in an image block according to an embodiment of the invention will be described below with reference to Fig.4. As illustrated in Fig.4, the identifying device 400 for identifying the direction of characters in an image block according to the embodiment includes: an optical character recognition processing unit 401 configured for performing optical character recognition processing on the image block by assuming various directions as assumed character directions, respectively, to obtain sub image blocks, recognized characters corresponding to the sub image blocks and correctness measures thereof in each of the assumed character directions; a language group determination unit 402 configured for determining a language group which the characters in the image block belong to; a sub image block adjusting unit 403 configured for adjusting a correctness measure corresponding to a sub image block which corresponds to a recognized character not belonging to the determined language group in each of the assumed character directions; an accumulative correctness measure calculating unit 404 configured for calculating an accumulative correctness measure in each of the assumed character directions based on the adjusted correctness measure; and a character direction identifying unit 405 configured for identifying the direction of the characters in the image block according to the accumulative correctness measures.

    [0046] Since processing in the optical character recognition processing unit 401, the language group determination unit 402, the sub image block adjusting unit 403, the accumulative correctness measure calculating unit 404 and the character direction identifying unit 405 included in the identifying device 400 according to the invention is similar respectively to processing at S301-S305 of the method for identifying the direction of characters in an image block described above, a detailed description of these units will be omitted here for the sake of conciseness.
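    One possible software arrangement of the units 401 to 405 is sketched below as a set of interchangeable callables. The class name, field names and type aliases are assumptions; the description leaves the configuration of the units open.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Candidates = List[Tuple[str, float]]       # (candidate character, correctness measure)
OcrResult = Dict[int, List[Candidates]]    # assumed direction -> one list per sub image block


@dataclass
class IdentifyingDevice:
    ocr_unit: Callable[[object], OcrResult]               # unit 401
    language_group_unit: Callable[[OcrResult], str]       # unit 402
    adjusting_unit: Callable[[Candidates, str], float]    # unit 403
    accumulating_unit: Callable[[List[float]], float]     # unit 404
    direction_unit: Callable[[Dict[int, float]], int]     # unit 405

    def identify(self, image_block) -> int:
        """Run the units in the order of steps S301 to S305."""
        ocr = self.ocr_unit(image_block)
        group = self.language_group_unit(ocr)
        adjusted = {
            direction: [self.adjusting_unit(c, group) for c in blocks]
            for direction, blocks in ocr.items()
        }
        accumulated = {d: self.accumulating_unit(m) for d, m in adjusted.items()}
        return self.direction_unit(accumulated)
```

    Each field can be supplied with any function of the matching shape, so a confidence-based or a distance-based variant differs only in the adjusting and direction callables.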

    [0047] Furthermore it shall be noted here that the respective constituent modules and units in the foregoing device can be configured in software, firmware, hardware or a combination thereof. Specific configuration means or approaches available are well known to those skilled in the art, and a repeated description thereof will be omitted here. In the case of being implemented in software or firmware, a program constituting the software is installed from a storage medium or a network to a computer with a dedicated hardware structure (e.g., a general-purpose computer 500 illustrated in Fig.5, etc.), which can perform various functions when various programs are installed thereon.

    [0048] In Fig.5, a Central Processing Unit (CPU) 501 performs various processes according to a program stored in a Read Only Memory (ROM) 502 or loaded from a storage part 508 into a Random Access Memory (RAM) 503 in which data required when the CPU 501 performs the various processes is also stored as needed. The CPU 501, the ROM 502 and the RAM 503 are connected to each other via a bus 504 to which an input/output interface 505 is also connected.

    [0049] The following components are connected to the input/output interface 505: an input part 506 (including a keyboard, a mouse, etc.); an output part 507 (including a display, e.g., a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), a speaker, etc.); a storage part 508 (including a hard disk, etc.); and a communication part 509 (including a network interface card, e.g., a LAN card, a modem, etc.). The communication part 509 performs a communication process over a network, e.g., the Internet. A drive 510 is also connected to the input/output interface 505 as needed. A removable medium 511, e.g., a magnetic disk, an optical disk, a magneto-optical disk or a semiconductor memory, can be installed on the drive 510 as needed so that a computer program fetched therefrom can be installed into the storage part 508 as needed.

    [0050] In the case that the foregoing series of processes is performed in software, a program constituting the software is installed from a network, e.g., the Internet, or from a storage medium, e.g., the removable medium 511.

    [0051] Those skilled in the art will appreciate that such a storage medium is not limited to the removable medium 511 illustrated in Fig.5, in which the program is stored and which is distributed separately from the device to provide a user with the program. Examples of the removable medium 511 include a magnetic disk (including a Floppy Disk (a registered trademark)), an optical disk (including a Compact Disk-Read Only Memory (CD-ROM) and a Digital Versatile Disk (DVD)), a magneto-optical disk (including a Mini Disk (MD) (a registered trademark)) and a semiconductor memory. Alternatively, the storage medium can be the ROM 502, the hard disk included in the storage part 508, etc., in which the program is stored and which is distributed to the user together with the device including the same.

    [0052] Embodiments of the invention also include a program product having machine readable instruction codes stored thereon which, when read and executed by a machine, can perform the foregoing method according to the embodiment of the invention.

    [0053] Correspondingly, a storage medium embodying the program product with the machine readable instruction codes stored thereon is also encompassed in the disclosure of the invention. The storage medium includes but is not limited to a floppy disk, an optical disk, a magneto-optical disk, a memory card, a memory stick, etc.

    [0054] The identifying device for identifying the direction of characters in an image block, the identifying method of identifying the direction of characters in an image block and the corresponding program product, which are disclosed in the embodiments of the invention, are applicable to a scanner or other image scanning devices to identify the direction of characters in a scanned document.


    Claims

    1. A method of identifying the direction of characters in an image block, comprising:

    performing optical character recognition processing on the image block by assuming various directions as assumed character directions, respectively, to obtain sub image blocks, recognized characters corresponding to the sub image blocks and correctness measures thereof in each of the assumed character directions; and

    identifying the direction of the characters in the image block using the correctness measures by:

    defining a plurality of language groups each comprising a character set;

    determining a language group which the characters in the image block belong to;

    adjusting a correctness measure corresponding to a sub image block which corresponds to a recognized character not belonging to the determined language group in each of the assumed character directions;

    calculating an accumulative correctness measure in each of the assumed character directions based on the adjusted correctness measure; and

    identifying the direction of the characters in the image block according to the accumulative correctness measures; wherein

    said determining a language group which the characters in the image block belong to comprises:

    in each of the assumed character directions, calculating the proportion of sub image blocks corresponding to recognized characters which belong to each of the language groups in all of the sub image blocks in the assumed character direction, respectively; and

    determining the language group as the language group which the characters in the image block belong to when at least one condition is fulfilled, the at least one condition being that the proportion of sub image blocks corresponding to recognized characters which belong to a certain language group in all of the sub image blocks in the assumed character direction is larger than a first threshold;

    wherein the language group comprises a Latin language group, a Korean language group and a Chinese/Japanese language group, the Latin language group comprises Latin letters, Arabic numerals and non-character symbols, the Korean language group comprises Korean letters, Latin letters, Arabic numerals and non-character symbols, the Chinese/Japanese language group comprises simplified Chinese characters, traditional Chinese characters, Chinese characters in Japanese, Japanese hiragana, Japanese katakana, Latin letters, Arabic numerals and non-character symbols.


     
    2. The method according to claim 1, wherein:

    said correctness measure comprises a confidence and a recognition distance; and

    said adjusting a correctness measure corresponding to a sub image block which corresponds to a recognized character not belonging to the determined language group in each of the assumed character directions comprises decreasing the confidence corresponding to the sub image block or increasing the recognition distance corresponding to the sub image block.


     
    3. The method according to claim 1, wherein:

    said correctness measure comprises a confidence and a recognition distance; and

    said adjusting a correctness measure corresponding to a sub image block which corresponds to a recognized character not belonging to the determined language group in each of the assumed character directions comprises adjusting the confidence or the recognition distance corresponding to the sub image block to the maximum confidence among confidences corresponding to all of the candidate characters that belong to the determined language group in the assumed character direction and correspond to the sub image block or a minimum recognition distance among recognition distances corresponding to all of the candidate characters that belong to the determined language group in the assumed character direction and correspond to the sub image block, and, when none of the candidate characters in the assumed character direction which correspond to the sub image block belong to the determined language group, taking the minimum confidence among confidences corresponding to all of the candidate characters in the assumed character direction which correspond to the sub image block or the maximum recognition distance among recognition distances corresponding to all of the candidate characters in the assumed character direction which correspond to the sub image block as the confidence or the recognition distance corresponding to the sub image block.


     
    4. The method according to any of claims 1 to 3, wherein in case of a Korean language group or a Chinese/Japanese language group, in said determining a language group which the characters in the image block belong to, said at least one condition further comprises the condition that the proportion of sub image blocks corresponding to recognized characters which belong to a core character set of the language group in all of the sub image blocks in the assumed character direction is larger than a second threshold;
    wherein the core character set of the language group comprises the characters of the language group other than Latin letters, Arabic numerals and non-character symbols.
     
    5. The method according to claim 1, wherein a value range of the first threshold for the Latin language group is 0.65 to 0.85.
     
    6. The method according to claim 4, wherein:

    a core character set of the Korean language group comprises Korean letters;

    a core character set of the Chinese/Japanese language group comprises simplified Chinese characters, traditional Chinese characters, Chinese characters in Japanese, Japanese hiragana and Japanese katakana; and

    a value range of the first threshold for the Korean language group and the Chinese/Japanese language group is 0.65 to 0.85, and a value range of the second threshold for the Korean language group and the Chinese/Japanese language group is 0.45 to 0.65.


     
    7. The method according to any one of claims 1-6, wherein the various directions comprise two transverse directions and two longitudinal directions of the image block.
     
    8. A device for identifying the direction of characters in an image block, comprising:

    an optical character recognition processing unit configured for performing optical character recognition processing on the image block by assuming various directions as assumed character directions, respectively, to obtain sub image blocks, recognized characters corresponding to the sub image blocks and correctness measures thereof in each of the assumed character directions;

    a language group determination unit configured with a plurality of language groups each comprising a character set, for determining a language group which the characters in the image block belong to;

    a sub image block adjusting unit configured for adjusting a correctness measure corresponding to a sub image block which corresponds to a recognized character not belonging to the determined language group in each of the assumed character directions;

    an accumulative correctness measure calculating unit configured for calculating an accumulative correctness measure in each of the assumed character directions based on the adjusted correctness measure; and

    a character direction identifying unit configured for identifying the direction of the characters in the image block according to the accumulative correctness measures;

    wherein said language group determination unit is configured for, in each of the assumed character directions,

    calculating the proportion of sub image blocks corresponding to recognized characters which belong to each of the language groups in all of the sub image blocks in the assumed character direction, respectively; and

    determining the language group as the language group which the characters in the image block belong to when at least one condition is fulfilled, the at least one condition being that the proportion of sub image blocks corresponding to recognized characters which belong to a certain language group in all of the sub image blocks in the assumed character direction is larger than a first threshold;

    wherein the language group comprises a Latin language group, a Korean language group and a Chinese/Japanese language group, the Latin language group comprises Latin letters, Arabic numerals and non-character symbols, the Korean language group comprises Korean letters, Latin letters, Arabic numerals and non-character symbols, the Chinese/Japanese language group comprises simplified Chinese characters, traditional Chinese characters, Chinese characters in Japanese, Japanese hiragana, Japanese katakana, Latin letters, Arabic numerals and non-character symbols.


     
    9. The device according to claim 8, wherein the sub image block adjusting unit is configured for decreasing the confidence corresponding to a sub image block which corresponds to a recognized character not belonging to the determined language group in each of the assumed character directions or increasing the recognition distance corresponding to the sub image block.
     
    10. The device according to claim 8, wherein the sub image block adjusting unit is configured for adjusting the confidence or the recognition distance corresponding to a sub image block which corresponds to a recognized character not belonging to the determined language group in each of the assumed character directions to the maximum confidence among confidences corresponding to all of the candidate characters that belong to the determined language group in the assumed character direction and correspond to the sub image block or a minimum recognition distance among recognition distances corresponding to all of the candidate characters that belong to the determined language group in the assumed character direction and correspond to the sub image block, and, when none of the candidate characters in the assumed character direction which correspond to the sub image block belong to the determined language group, taking the minimum confidence among confidences corresponding to all of the candidate characters in the assumed character direction which correspond to the sub image block or the maximum recognition distance among recognition distances corresponding to all of the candidate characters in the assumed character direction which correspond to the sub image block as the confidence or the recognition distance corresponding to the sub image block.
     
    11. The device according to any of claims 8 to 10, wherein in case of a Korean language group or a Chinese/Japanese language group, in the language group determination unit, said at least one condition comprises the further condition that the proportion of sub image blocks corresponding to recognized characters which belong to a core character set of the language group in all of the sub image blocks in the assumed character direction is larger than a second threshold;
    wherein the core character set of the language group comprises the characters of the language group other than Latin letters, Arabic numerals and non-character symbols.
     
    12. A scanner comprising the device for identifying the direction of characters in an image block according to any one of claims 8-11.
     
    13. A computer program product comprising instructions which, when the program is executed by a computing device, cause the computing device to perform a method according to any of claims 1 to 7.
     


    Ansprüche

    1. Verfahren zur Identifizierung der Richtung von Zeichen in einem Bildblock, umfassend:

    Durchführen einer optischen Zeichenerkennungsverarbeitung auf dem Bildblock durch Annehmen jeweils von verschiedenen Richtungen als angenommene Zeichenrichtungen, um Teilbildblöcke, erkannte Zeichen entsprechend den Teilbildblöcken und Korrektheitsmaße davon in jeder der angenommenen Zeichenrichtungen zu erhalten; und

    Identifizieren der Richtung der Zeichen in dem Bildblock unter Verwendung der Korrektheitsmaße durch:

    Definieren von mehreren Sprachgruppen, die jeweils einen Zeichensatz umfassen;

    Bestimmen einer Sprachgruppe, zu welcher die Zeichen in dem Bildblock gehören;

    Einstellen eines Korrektheitsmaßes entsprechend einem Teilbildblock, welcher einem erkannten Zeichen entspricht, das nicht zu der bestimmten Sprachgruppe in jeder der angenommenen Zeichenrichtungen gehört;

    Berechnen eines akkumulativen Korrektheitsmaßes in jeder der angenommenen Zeichenrichtungen basierend auf dem eingestellten Korrektheitsmaß; und

    Identifizieren der Richtung der Zeichen in dem Bildblock gemäß den akkumulativen Korrektheitsmaßen; wobei

    das Bestimmen einer Sprachgruppe, zu welcher die Zeichen in dem Bildblock gehören, Folgendes umfasst:

    in jeder der angenommenen Zeichenrichtungen, Berechnen des Verhältnisses von Teilbildblöcken entsprechend erkannten Zeichen, welche jeweils zu jeder der Sprachgruppen gehören, in allen Teilbildblöcken in der angenommenen Zeichenrichtung; und

    Bestimmen der Sprachgruppe als die Sprachgruppe, zu welcher die Zeichen in dem Bildblock gehören, wenn mindestens eine Bedingung erfüllt ist, wobei die mindestens eine Bedingung ist, dass das Verhältnis von Teilbildblöcken entsprechend erkannten Zeichen, welche zu einer bestimmten Sprachgruppe gehören, in allen Teilbildblöcken in der angenommenen Zeichenrichtung größer als ein erster Schwellenwert ist;

    wobei die Sprachgruppe eine lateinische Sprachgruppe, eine koreanische Sprachgruppe und eine chinesische/japanische Sprachgruppe umfasst, wobei die lateinische Sprachgruppe lateinische Buchstaben, arabische Zahlen und Nicht-Zeichen-Symbole umfasst, die koreanische Sprachgruppe koreanische Buchstaben, lateinische Buchstaben, arabische Zahlen und Nicht-Zeichen-Symbole umfasst, die chinesische/japanische Sprachgruppe vereinfachte chinesische Zeichen, traditionelle chinesische Zeichen, chinesische Zeichen auf Japanisch, japanisches Hiragana, japanisches Katakana, lateinische Buchstaben, arabische Zahlen und Nicht-Zeichen-Symbole umfasst.


     
    2. Verfahren nach Anspruch 1, wobei:

    das Korrektheitsmaß eine Konfidenz und eine Erkennungsentfernung umfasst; und

    das Einstellen eines Korrektheitsmaßes entsprechend einem Teilbildblock, welcher einem erkannten Zeichen entspricht, das nicht zu der bestimmten Sprachgruppe in jeder der angenommenen Zeichenrichtungen gehört, das Verringern der Konfidenz entsprechend dem Teilbildblock oder Vergrößern der Erkennungsentfernung entsprechend dem Teilbildblock umfasst.


     
    3. Verfahren nach Anspruch 1, wobei:

    das Korrektheitsmaß eine Konfidenz und eine Erkennungsentfernung umfasst; und

    das Einstellen eines Korrektheitsmaßes entsprechend einem Teilbildblock, welcher einem erkannten Zeichen entspricht, das nicht zu der bestimmten Sprachgruppe in jeder der angenommenen Zeichenrichtungen gehört, das Einstellen der Konfidenz oder der Erkennungsentfernung entsprechend dem Teilbildblock auf die maximale Konfidenz unter Konfidenzen entsprechend allen Kandidatenzeichen, die zu der bestimmten Sprachgruppe in der angenommenen Zeichenrichtung gehören und dem Teilbildblock entsprechen, oder eine minimale Erkennungsentfernung unter Erkennungsentfernungen entsprechend allen Kandidatenzeichen, die zu der bestimmten Sprachgruppe in der angenommenen Zeichenrichtung gehören und dem Teilbildblock entsprechen, und, wenn keines der Kandidatenzeichen in der angenommenen Zeichenrichtung, welche dem Teilbildblock entsprechen, zu der bestimmten Sprachgruppe gehört, Übernehmen der minimalen Konfidenz unter Konfidenzen entsprechend allen Kandidatenzeichen in der angenommenen Zeichenrichtung, welche dem Teilbildblock entsprechen, oder der maximalen Erkennungsentfernung unter Erkennungsentfernungen entsprechend allen Kandidatenzeichen in der angenommenen Zeichenrichtung, welche dem Teilbildblock entsprechen, als die Konfidenz oder die Erkennungsentfernung entsprechend dem Teilbildblock umfasst.


     
    4. Verfahren nach einem der Ansprüche 1 bis 3, wobei im Falle einer koreanischen Sprachgruppe oder einer chinesischen/japanischen Sprachgruppe bei dem Bestimmen einer Sprachgruppe, zu welcher die Zeichen in dem Bildblock gehören, die mindestens eine Bedingung ferner die Bedingung, dass das Verhältnis von Teilbildblöcken entsprechend erkannten Zeichen, welche zu einem Kernzeichensatz der Sprachgruppe gehören, in allen Teilbildblöcken in der angenommenen Zeichenrichtung größer als ein zweiter Schwellenwert ist, umfasst;
    wobei der Kernzeichensatz der Sprachgruppe die Zeichen der Sprachgruppe umfasst, die sich von lateinischen Buchstaben, arabischen Zahlen und Nicht-Zeichen-Symbolen unterscheiden.
     
    5. Verfahren nach Anspruch 1, wobei ein Wertebereich des ersten Schwellenwerts für die lateinische Sprachgruppe 0,65 bis 0,85 ist.
     
    6. Verfahren nach Anspruch 4, wobei:

    ein Kernzeichensatz der koreanischen Sprachgruppe koreanische Buchstaben umfasst;

    ein Kernzeichensatz der chinesischen/japanischen Sprachgruppe vereinfachte chinesische Zeichen, traditionelle chinesische Zeichen, chinesische Zeichen auf Japanisch, japanisches Hiragana und japanisches Katakana umfasst; und

    ein Wertebereich des ersten Schwellenwerts für die koreanische Sprachgruppe und die chinesische/japanische Sprachgruppe 0,65 bis 0,85 ist und ein Wertebereich des zweiten Schwellenwerts für die koreanische Sprachgruppe und die chinesische/japanische Sprachgruppe 0,45 bis 0,65 ist.


     
    7. Verfahren nach einem der Ansprüche 1-6, wobei die verschiedenen Richtungen zwei Querrichtungen und zwei Längsrichtungen des Bildblocks umfassen.
     
    8. Vorrichtung zur Identifizierung der Richtung von Zeichen in einem Bildblock, umfassend:

    eine optische Zeichenerkennungsverarbeitungseinheit, die zum Durchführen einer optischen Zeichenerkennungsverarbeitung auf dem Bildblock durch Annehmen jeweils verschiedener Richtungen als angenommene Zeichenrichtungen zum Erhalten von Teilbildblöcken, erkannten Zeichen entsprechend den Teilbildblöcken und Korrektheitsmaßen davon in jeder der angenommenen Zeichenrichtungen konfiguriert ist;

    eine Sprachgruppenbestimmungseinheit, die mit mehreren Sprachgruppen, die jeweils einen Zeichensatz umfassen, zum Bestimmen einer Sprachgruppe, zu welcher die Zeichen in dem Bildblock gehören, konfiguriert ist;

    eine Teilbildblockeinstellungseinheit, die zum Einstellen eines Korrektheitsmaßes entsprechend einem Teilbildblock, welcher einem erkannten Zeichen entspricht, das nicht zu der bestimmten Sprachgruppe in jeder der angenommenen Zeichenrichtungen gehört, konfiguriert ist;

    eine Akkumulativkorrektheitsmaßberechnungseinheit, die zum Berechnen eines akkumulativen Korrektheitsmaßes in jeder der angenommenen Zeichenrichtungen basierend auf dem eingestellten Korrektheitsmaß konfiguriert ist; und

    eine Zeichenrichtungsidentifizierungseinheit, die zum Identifizieren der Richtung der Zeichen in dem Bildblock gemäß den akkumulativen Korrektheitsmaßen konfiguriert ist;

    wobei die Sprachgruppenbestimmungseinheit konfiguriert ist, um in jeder der angenommenen Zeichenrichtungen

    das Verhältnis von Teilbildblöcken entsprechend erkannten Zeichen, welche zu jeder der Sprachgruppen gehören, jeweils in allen Teilbildblöcken in der angenommenen Zeichenrichtung zu berechnen; und

    die Sprachgruppe als die Sprachgruppe zu bestimmen, zu welcher die Zeichen in dem Bildblock gehören, wenn mindestens eine Bedingung erfüllt ist, wobei die mindestens eine Bedingung ist, dass das Verhältnis von Teilbildblöcken entsprechend erkannten Zeichen, welche zu einer bestimmten Sprachgruppe gehören, in allen Teilbildblöcken in der angenommenen Zeichenrichtung größer als ein erster Schwellenwert ist;

    wobei die Sprachgruppe eine lateinische Sprachgruppe, eine koreanische Sprachgruppe und eine chinesische/japanische Sprachgruppe umfasst, wobei die lateinische Sprachgruppe lateinische Buchstaben, arabische Zahlen und Nicht-Zeichen-Symbole umfasst, die koreanische Sprachgruppe koreanische Buchstaben, lateinische Buchstaben, arabische Zahlen und Nicht-Zeichen-Symbole umfasst, die chinesische/japanische Sprachgruppe vereinfachte chinesische Zeichen, traditionelle chinesische Zeichen, chinesische Zeichen auf Japanisch, japanisches Hiragana, japanisches Katakana, lateinische Buchstaben, arabische Zahlen und Nicht-Zeichen-Symbole umfasst.


     
    9. Vorrichtung nach Anspruch 8, wobei die Teilbildblockeinstellungseinheit zum Verringern der Konfidenz entsprechend einem Teilbildblock, welcher einem erkannten Zeichen entspricht, das nicht zu der bestimmten Sprachgruppe in jeder der angenommenen Zeichenrichtungen gehört, oder Vergrößern der Erkennungsentfernung entsprechend dem Teilbildblock konfiguriert ist.
     
    10. Vorrichtung nach Anspruch 8, wobei die Teilbildblockeinstellungseinheit zum Einstellen der Konfidenz oder der Erkennungsentfernung entsprechend einem Teilbildblock, welcher einem erkannten Zeichen entspricht, das nicht zu der bestimmten Sprachgruppe in jeder der angenommenen Zeichenrichtungen gehört, auf die maximale Konfidenz unter Konfidenzen entsprechend allen Kandidatenzeichen, die zu der bestimmten Sprachgruppe in der angenommenen Zeichenrichtung gehören und dem Teilbildblock entsprechen, oder eine minimale Erkennungsentfernung unter Erkennungsentfernungen entsprechend allen Kandidatenzeichen, die zu der bestimmten Sprachgruppe in der angenommenen Zeichenrichtung gehören und dem Teilbildblock entsprechen, und, wenn keines der Kandidatenzeichen in der angenommenen Zeichenrichtung, welche dem Teilbildblock entsprechen, zu der bestimmten Sprachgruppe gehört, Übernehmen der minimalen Konfidenz unter Konfidenzen entsprechend allen Kandidatenzeichen in der angenommenen Zeichenrichtung, welche dem Teilbildblock entsprechen, oder der maximalen Erkennungsentfernung unter Erkennungsentfernungen entsprechend allen Kandidatenzeichen in der angenommenen Zeichenrichtung, welche dem Teilbildblock entsprechen, als die Konfidenz oder die Erkennungsentfernung entsprechend dem Teilbildblock konfiguriert ist.
     
    11. Vorrichtung nach einem der Ansprüche 8 bis 10, wobei im Falle einer koreanischen Sprachgruppe oder einer chinesischen/japanischen Sprachgruppe in der Sprachgruppenbestimmungseinheit die mindestens eine Bedingung die weitere Bedingung, dass das Verhältnis von Teilbildblöcken entsprechend erkannten Zeichen, welche zu einem Kernzeichensatz der Sprachgruppe gehören, in allen Teilbildblöcken in der angenommenen Zeichenrichtung größer als ein zweiter Schwellenwert ist, umfasst;
    wobei der Kernzeichensatz der Sprachgruppe die Zeichen der Sprachgruppe umfasst, die sich von lateinischen Buchstaben, arabischen Zahlen und Nicht-Zeichen-Symbolen unterscheiden.
     
    12. Scanner, der die Vorrichtung zur Identifizierung der Richtung von Zeichen in einem Bildblock nach einem der Ansprüche 8-11 umfasst.
     
    13. Computerprogrammprodukt umfassend Befehle, welche, wenn das Programm von einer Computervorrichtung ausgeführt wird, bewirken, dass die Computervorrichtung ein Verfahren nach einem der Ansprüche 1 bis 7 durchführt.
     


    Revendications

    1. Procédé d'identification de la direction de caractères dans un bloc d'image, comprenant les étapes ci-dessous consistant à :

    mettre en œuvre un traitement de reconnaissance optique de caractères sur le bloc d'image, en supposant diverses directions en tant que des directions de caractères supposées, respectivement, en vue d'obtenir des blocs de sous-image, des caractères reconnus correspondant aux blocs de sous-image, et des mesures d'exactitude de ceux-ci dans chacune des directions de caractères supposées ; et

    identifier la direction des caractères dans le bloc d'image en utilisant les mesures d'exactitude, en mettant en œuvre les étapes ci-dessous consistant à :

    définir une pluralité de groupes linguistiques comprenant chacun un jeu de caractères ;

    déterminer un groupe linguistique auquel appartiennent les caractères du bloc d'image ;

    ajuster une mesure d'exactitude correspondant à un bloc de sous-image qui correspond à un caractère reconnu n'appartenant pas au groupe linguistique déterminé dans chacune des directions de caractères supposées ;

    calculer une mesure d'exactitude cumulative dans chacune des directions de caractères supposées sur la base de la mesure d'exactitude ajustée ; et

    identifier la direction des caractères dans le bloc d'image selon les mesures d'exactitude cumulatives ; dans lequel

    ladite étape de détermination d'un groupe linguistique auquel appartiennent les caractères du bloc d'image comprend les étapes ci-dessous consistant à :

    dans chacune des directions de caractères supposées, calculer la proportion de blocs de sous-image correspondant aux caractères reconnus qui appartiennent à chacun des groupes linguistiques dans la totalité des blocs de sous-image dans la direction de caractères supposée, respectivement ; et

    déterminer le groupe linguistique comme étant le groupe linguistique auquel appartiennent les caractères du bloc d'image lorsqu'au moins une condition est satisfaite, ladite au moins une condition étant que la proportion de blocs de sous-image correspondant à des caractères reconnus qui appartiennent à un groupe linguistique donné dans la totalité des blocs de sous-image dans la direction de caractères supposée soit supérieure à un premier seuil ;

    dans lequel le groupe linguistique comprend un groupe linguistique latin, un groupe linguistique coréen et un groupe linguistique chinois/japonais, le groupe linguistique latin comprend des lettres latines, des chiffres arabes et des symboles autres que des caractères, le groupe linguistique coréen comprend des lettres coréennes, des lettres latines, des chiffres arabes et des symboles autres que des caractères, le groupe linguistique chinois/japonais comprend des caractères chinois simplifiés, des caractères chinois traditionnels, des caractères chinois en japonais, en hiragana japonais et en katakana japonais, des lettres latines, des chiffres arabes et des symboles autres que des caractères.


     
    2. Procédé selon la revendication 1, dans lequel :

    ladite mesure d'exactitude comprend une confiance et une distance de reconnaissance ; et

    ladite étape d'ajustement d'une mesure d'exactitude correspondant à un bloc de sous-image qui correspond à un caractère reconnu n'appartenant pas au groupe linguistique déterminé dans chacune des directions de caractères supposées comprend l'étape consistant à diminuer la confiance correspondant au bloc de sous-image ou à augmenter la distance de reconnaissance correspondant au bloc de sous-image.


     
    3. Procédé selon la revendication 1, dans lequel :

    ladite mesure d'exactitude comprend une confiance et une distance de reconnaissance ; et

    ladite étape d'ajustement d'une mesure d'exactitude correspondant à un bloc de sous-image qui correspond à un caractère reconnu n'appartenant pas au groupe linguistique déterminé dans chacune des directions de caractères supposées, consiste à ajuster la confiance ou la distance de reconnaissance correspondant au bloc de sous-image à la confiance maximale parmi des confiances correspondant à la totalité des caractères candidats qui appartiennent au groupe linguistique déterminé dans la direction de caractères supposée et correspondent au bloc de sous-image, ou à une distance de reconnaissance minimale parmi des distances de reconnaissance correspondant à la totalité des caractères candidats qui appartiennent au groupe linguistique déterminé dans la direction de caractères supposée et correspondent au bloc de sous-image, et, lorsqu'aucun des caractères candidats dans la direction de caractères supposée qui correspondent au bloc de sous-image n'appartient au groupe linguistique déterminé, à prendre la confiance minimale parmi des confiances correspondant à la totalité des caractères candidats dans la direction de caractères supposée qui correspondent au bloc de sous-image, ou la distance de reconnaissance maximale parmi des distances de reconnaissance correspondant à la totalité des caractères candidats dans la direction de caractères supposée qui correspondent au bloc de sous-image, en tant que la confiance ou que la distance de reconnaissance correspondant au bloc de sous-image.


     
    4. The method according to any one of claims 1 to 3, wherein, in the case of a Korean language group or a Chinese/Japanese language group, in said step of determining a language group to which the characters in the image block belong, said at least one condition comprises the additional condition that the proportion of sub image blocks corresponding to recognized characters which belong to a basic character set of the language group among all the sub image blocks in the assumed character direction is above a second threshold;
    wherein the basic character set of the language group comprises the characters of the language group other than Latin letters, Arabic numerals and symbols other than characters.
     
    5. The method according to claim 1, wherein the first threshold for the Latin language group takes a value in the range from 0.65 to 0.85.
     
    6. The method according to claim 4, wherein:

    a basic character set of the Korean language group comprises Korean letters;

    a basic character set of the Chinese/Japanese language group comprises simplified Chinese characters, traditional Chinese characters, Chinese characters in Japanese, Japanese hiragana and Japanese katakana; and

    the first threshold for the Korean language group and the Chinese/Japanese language group takes a value in the range from 0.65 to 0.85, and the second threshold for the Korean language group and the Chinese/Japanese language group takes a value in the range from 0.45 to 0.65.


     
    7. The method according to any one of claims 1 to 6, wherein the various directions comprise two transverse directions and two longitudinal directions of the image block.
     
    8. A device for identifying the direction of characters in an image block, comprising:

    an optical character recognition processing unit configured to perform optical character recognition processing on the image block by assuming various directions as assumed character directions, respectively, to obtain sub image blocks, recognized characters corresponding to the sub image blocks, and correctness measures thereof in each of the assumed character directions;

    a language group determining unit configured with a plurality of language groups, each comprising a character set, to determine a language group to which the characters in the image block belong;

    a sub image block adjusting unit configured to adjust a correctness measure corresponding to a sub image block which corresponds to a recognized character not belonging to the determined language group in each of the assumed character directions;

    an accumulative correctness measure calculating unit configured to calculate an accumulative correctness measure in each of the assumed character directions based on the adjusted correctness measure; and

    a character direction identifying unit configured to identify the direction of the characters in the image block according to the accumulative correctness measures;

    wherein said language group determining unit is configured, in each of the assumed character directions:

    to calculate the proportion of sub image blocks corresponding to recognized characters which belong to each of the language groups among all the sub image blocks in the assumed character direction, respectively; and

    to determine the language group as the language group to which the characters in the image block belong when at least one condition is satisfied, said at least one condition being that the proportion of sub image blocks corresponding to recognized characters which belong to a given language group among all the sub image blocks in the assumed character direction is above a first threshold;

    wherein the language group comprises a Latin language group, a Korean language group and a Chinese/Japanese language group, the Latin language group comprises Latin letters, Arabic numerals and symbols other than characters, the Korean language group comprises Korean letters, Latin letters, Arabic numerals and symbols other than characters, and the Chinese/Japanese language group comprises simplified Chinese characters, traditional Chinese characters, Chinese characters in Japanese, Japanese hiragana, Japanese katakana, Latin letters, Arabic numerals and symbols other than characters.


     
    9. The device according to claim 8, wherein the sub image block adjusting unit is configured to decrease the confidence corresponding to a sub image block which corresponds to a recognized character not belonging to the determined language group in each of the assumed character directions, or to increase the recognition distance corresponding to the sub image block.
     
    10. The device according to claim 8, wherein the sub image block adjusting unit is configured to adjust the confidence or the recognition distance corresponding to a sub image block which corresponds to a recognized character not belonging to the determined language group in each of the assumed character directions to the maximum confidence among the confidences corresponding to all the candidate characters which belong to the determined language group in the assumed character direction and correspond to the sub image block, or to the minimum recognition distance among the recognition distances corresponding to all the candidate characters which belong to the determined language group in the assumed character direction and correspond to the sub image block, and, when none of the candidate characters in the assumed character direction which correspond to the sub image block belongs to the determined language group, to take the minimum confidence among the confidences corresponding to all the candidate characters in the assumed character direction which correspond to the sub image block, or the maximum recognition distance among the recognition distances corresponding to all the candidate characters in the assumed character direction which correspond to the sub image block, as the confidence or the recognition distance corresponding to the sub image block.
     
    11. The device according to any one of claims 8 to 10, wherein, in the case of a Korean language group or a Chinese/Japanese language group, in the language group determining unit, said at least one condition comprises the additional condition that the proportion of sub image blocks corresponding to recognized characters which belong to a basic character set of the language group among all the sub image blocks in the assumed character direction is above a second threshold;
    wherein the basic character set of the language group comprises the characters of the language group other than Latin letters, Arabic numerals and symbols other than characters.
     
    12. A scanner comprising the device for identifying the direction of characters in an image block according to any one of claims 8 to 11.
     
    13. A computer program product comprising instructions which, when the program is executed by a computing device, cause the computing device to carry out a method according to any one of claims 1 to 7.
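
    As an informal illustration only, and not part of the claims, the processing recited in claims 8, 9 and 11 might be sketched as follows. The sketch assumes that optical character recognition has already been run for each assumed direction and that each recognized character has been pre-classified into one of three hypothetical classes (COMMON, HANGUL, CJ). The names SubBlock, determine_group, cumulative_confidence and identify_direction, the penalty factor of 0.5, the threshold values 0.75 and 0.55 (picked from inside the ranges of claims 5 and 6), and the rule of taking the first direction that yields a language group are all assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical character classes for a recognized character.
COMMON = "common"   # Latin letters, Arabic numerals, symbols
HANGUL = "hangul"   # Korean letters
CJ = "cj"           # Chinese characters, hiragana, katakana

# Character set of each language group, following the listing in claim 8.
GROUPS = {
    "latin": {COMMON},
    "korean": {HANGUL, COMMON},
    "chinese_japanese": {CJ, COMMON},
}
# Basic character set (claim 11): the group without the common class.
BASIC = {"korean": {HANGUL}, "chinese_japanese": {CJ}}


@dataclass
class SubBlock:
    char_class: str    # class of the recognized character
    confidence: float  # correctness measure, higher is better


def determine_group(blocks, first=0.75, second=0.55):
    """Return the language group for one assumed direction, or None."""
    total = len(blocks)
    if total == 0:
        return None
    for name, charset in GROUPS.items():
        share = sum(b.char_class in charset for b in blocks) / total
        if share <= first:
            continue  # first-threshold condition not met
        basic = BASIC.get(name)
        if basic is not None:
            basic_share = sum(b.char_class in basic for b in blocks) / total
            if basic_share <= second:
                continue  # additional second-threshold condition not met
        return name
    return None


def cumulative_confidence(blocks, group, penalty=0.5):
    """Sum confidences, decreasing those of characters outside the group."""
    charset = GROUPS[group]
    return sum(
        b.confidence if b.char_class in charset else b.confidence * penalty
        for b in blocks
    )


def identify_direction(results):
    """results maps an assumed direction (degrees) to its sub image blocks."""
    # Assumption: use the first direction for which a group is determined.
    group = next(
        (g for g in map(determine_group, results.values()) if g is not None),
        None,
    )
    if group is None:
        return None
    return max(results, key=lambda d: cumulative_confidence(results[d], group))


if __name__ == "__main__":
    upright = [SubBlock(COMMON, 0.9)] * 8 + [SubBlock(CJ, 0.6)] * 2
    flipped = [SubBlock(COMMON, 0.4)] * 5 + [SubBlock(CJ, 0.7)] * 5
    print(identify_direction({0: upright, 180: flipped}))  # prints 0
```

    In this sketch a confidence is used as the correctness measure, so the direction with the largest cumulative value is selected. With a recognition distance the penalty would instead increase the value, and the smallest cumulative value would be selected.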
     




    Drawing

    Cited references

    REFERENCES CITED IN THE DESCRIPTION



    This list of references cited by the applicant is for the reader's convenience only. It does not form part of the European patent document. Even though great care has been taken in compiling the references, errors or omissions cannot be excluded and the EPO disclaims all liability in this regard.

    Patent documents cited in the description