Reproducing "human judgment" with a few dozen training examples
2020/4/1 Turn text data into knowledge
KIBIT can learn from a small amount of training data
It is easy to assume that artificial intelligence needs thousands to tens of thousands of training examples, but with KIBIT you can start with a few dozen.
Let's consider how artificial intelligence can be put to use by looking at the mechanism, the process, and the characteristics of analysis with KIBIT, FRONTEO's artificial intelligence engine.
KIBIT, FRONTEO's artificial intelligence engine, analyzes text with a proprietary algorithm that is a type of machine learning. Our goal in developing KIBIT was "small, light, and highly accurate." "Small" refers to the amount of data needed for learning. No matter how good an AI is, it cannot be called practical if collecting the data and training on it takes a year. "Light" refers to the computing load: if analysis requires a high-performance processor, the cost of use rises and the environment takes time to build, which is not practical either.
Next, to obtain "accuracy," the artificial intelligence has to be given data and trained. In the learning phase at the start of a KIBIT analysis, you sort information you consider valuable into "want to find" and all other information into "don't need to find." This sorting is characteristic of KIBIT, and it delivers speed and accuracy with little training. Note that you do not "select keywords"; you sort whole documents, for example one A4 page or one email. This is because KIBIT looks not only at individual words but at how the text of each sentence is put together, that is, at the entire context.
Now let's look at the actual analysis process that KIBIT performs. Below is an example of searching for evidence that companies are attempting to collude with one another, as part of an investigation into corporate misconduct (Fig. 1). Bid rigging is often preceded by "secret talks." To find signs of such talks, you classify the following email texts as "want to find" or "don't need to find" and give them to KIBIT. The red vertical lines show "morphological analysis," the first step of natural language processing, which divides sentences into their smallest units. At first glance the "want to find" and "don't need to find" emails do not seem to differ much, but KIBIT finds the difference instantly. For example, "izakaya" and "drinking" appear in both texts, but "private room," "since last time," and "time has passed" appear only in the former. KIBIT gives high scores to words and sentences in "want to find" and low scores to those in "don't need to find." An example of the weighting by score, based on the formula of the proprietary algorithm, is shown below (Fig. 2).
Figure 1. Example email when looking for evidence of collusion
Figure 2. Comparison of the weights and calculation formulas for "private room," "drinking," and "izakaya" (excerpt)
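KIBIT's scoring formula is proprietary and is not given in this article, so the following is only a minimal sketch of the general idea, using a different and well-known technique: the smoothed log-odds of word frequency between the two groups. The sample sentences, the function names, the smoothing parameter `alpha`, and the whitespace tokenizer (standing in for morphological analysis) are all assumptions made for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    # Stand-in for morphological analysis: lowercase and split on whitespace.
    return text.lower().split()

def word_weights(want_docs, skip_docs, alpha=1.0):
    """Smoothed log-odds weight per word (NOT KIBIT's formula).

    A positive weight means the word is relatively more frequent in the
    "want to find" documents; a negative weight means the opposite.
    """
    want = Counter(t for d in want_docs for t in tokenize(d))
    skip = Counter(t for d in skip_docs for t in tokenize(d))
    vocab = set(want) | set(skip)
    # Add-alpha smoothing so unseen words do not produce log(0).
    want_total = sum(want.values()) + alpha * len(vocab)
    skip_total = sum(skip.values()) + alpha * len(vocab)
    return {
        w: math.log((want[w] + alpha) / want_total)
        - math.log((skip[w] + alpha) / skip_total)
        for w in vocab
    }

# Hypothetical training examples (not the emails shown in Fig. 1).
want_docs = ["izakaya private room drinking since last time"]
skip_docs = ["izakaya drinking with the team on friday"]

weights = word_weights(want_docs, skip_docs)
for word in ("private", "izakaya", "friday"):
    print(word, round(weights[word], 3))
```

In this toy data, a word that appears in both groups equally often ends up with a weight near zero, while a word that appears in only one group is pushed toward the positive or negative side. That mirrors the contrast described above between "izakaya" and "private room."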
KIBIT learns the multidimensional combinations of "want to find" and "don't need to find" features in the text, and then analyzes and classifies the large volume of data it is given (Fig. 3). For example, a stack of 10,000 sheets of A4 paper is about 1 meter high. Going through such a stack page by page by eye takes an enormous amount of time, and if many people split the work by hand, omissions and mistakes occur. If you have KIBIT learn the documents you "want to find" and "don't need to find" and then run the text analysis, the 10,000 sheets are analyzed in about three and a half minutes. The results are scored and sorted by similarity to the learned "want to find" texts, so a large set of documents that was previously in no particular order is arranged by "want to find" priority, highest first.
Figure 3. KIBIT analysis flow
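The scoring and sorting step described above can be sketched as follows. This is not KIBIT's algorithm: it assumes each document's score is simply the average of per-word weights, and the weights, documents, and function names are hypothetical values chosen for illustration.

```python
def score(text, weights):
    # Average the weights of the words in the text; unknown words count as 0.
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(weights.get(t, 0.0) for t in tokens) / len(tokens)

def rank(documents, weights):
    # Highest score first, so the most "want to find"-like documents lead.
    return sorted(documents, key=lambda d: score(d, weights), reverse=True)

# Hypothetical word weights, as if learned from labeled examples.
weights = {"private": 0.7, "room": 0.7, "last": 0.5, "friday": -0.6, "team": -0.6}

# Hypothetical unlabeled documents to be prioritized.
documents = [
    "lunch with the team on friday",
    "book the private room like last time",
    "izakaya drinking tonight",
]

ranked = rank(documents, weights)
for doc in ranked:
    print(round(score(doc, weights), 3), doc)
```

Averaging rather than summing keeps long documents from outranking short ones purely because of their length. A reviewer would then read the list from the top and stop once the scores fall below a level worth checking.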
The most important factor in improving analysis accuracy is the training data from which KIBIT learns what you "want to find." In KIBIT use cases, experts and experienced people in each field draw on their experience and intuition, so-called tacit knowledge, such as "this email is suspicious" or "this kind of answer signals a sales opportunity." They simply select documents such as emails, daily reports, and customer feedback, and those become the "want to find" training data. Even without expert knowledge, documents from a past occasion when the thing you "want to find" actually happened also make good training data. In this case too, because you teach KIBIT whole documents rather than keywords, it analyzes the sequences of words and captures their characteristics. KIBIT can then surface "want to find" items that people had not noticed. Training an artificial intelligence is more effective with a clear purpose and perspective. Some people may think that more training data always means higher accuracy, but in practice accuracy falls if there is too much data or if it contains irrelevant information. The key to improving accuracy is to focus on the "want to find" perspective and on records of facts that occurred in the past.
Examples of KIBIT use based on this mechanism are summarized in Fig. 4.
Sales | ・Identifying order opportunities and risks of losing orders ・Compliance violation checks
Human resources | ・Preventing employee attrition and harassment ・HRTech (evaluation, placement, recruitment, etc.)
Marketing strategy | ・Analyzing and using customer feedback ・More efficient market, competitor, and technology research
Customer support | ・Discovering hidden complaints ・Identifying outbound calls that are likely to close
Manufacturing, development, and IP | ・More efficient market, competitor, and technology research ・Article search ・Q&A support for technology development ・Patent search and analysis, IP strategy
Legal affairs and compliance | ・Preventing and responding to information leaks ・Cartel and antitrust law measures ・Measures against accounting fraud, bribery, and FCPA violations ・Checks for conflicts with the Act against Unjustifiable Premiums and Misleading Representations
Figure 4. Areas where KIBIT is used
Does your company have work that involves "checking a large volume of records by eye and ear"? With KIBIT, you can comprehensively review records that until now could only be checked in part. Also, even when people are conveying the same thing, their wording and expressions vary, so entering a few "keywords" may not be enough to extract what you "want to find."
By giving KIBIT viewpoints and facts as training data, even a string of seemingly ordinary words can serve as a sensor for detecting human behavior. For example, when a manager asks an employee who is thinking about leaving, "Are you okay?", many will not say "I want to quit" right away and will answer "I'm okay" instead. With KIBIT, you can learn the characteristics from interview records of people who left in the past and detect cases where a person's behavior differs from the literal meaning of their words. In this way, KIBIT delivers both "quantity and quality": analysis of large volumes of data and accuracy in finding what matters.
Do you now have a rough picture of how artificial intelligence can be used? Because it works on familiar text data, KIBIT is an artificial intelligence you can start using right away. In addition to Japanese, it supports analysis in English, Chinese, and Korean, and its use in business solutions is expanding overseas as well as in Japan. Please give KIBIT your text data and the viewpoint you "want to find."