
A
AccessData
A US company that manufactures and sells FTK, a forensic investigation tool.
Android
An operating system for mobile devices such as mobile phones announced by Google in 2007.
APFS (Apple File System)
A new file system introduced by Apple for the first time in 20 years, it is optimized for flash memory and SSD and focuses on encryption.
B
BIOS (Basic Input / Output System)
A type of firmware. A program that controls the peripheral devices connected to a computer and performs the lowest level of input / output.
BlackBag
A company that manufactures and sells MacQuisition and BlackLight, which are preservation and analysis tools for Macintosh.
BlackLight
An image file analysis tool for Mac provided by BlackBag. Like FTK, it automatically builds a database from the acquired image data and provides an integrated analysis environment.
C
CHS(Cylinder / Head / Sector)
A method that indicates the absolute position when accessing the recording medium of a hard disk or floppy disk by using the three elements of the number of cylinders, the number of heads, and the number of sectors.
Concept Encoder
An artificial intelligence (AI) developed by FRONTEO Healthcare specifically for the healthcare field. Read as "Concept Encoder". It was developed with the goal of effectively analyzing and utilizing, on the basis of evidence, healthcare-related big data, which includes a large amount of free-form text. It brings statistical methods such as significance testing, which are indispensable to the "evidence-based medicine (EBM)" widely accepted among healthcare professionals, into natural language analysis.
CPU(Central Processing Unit)
One of the components of a computer, corresponding to the control device and the arithmetic unit among the five major devices. Also called the central processing unit, it controls each device mounted on the computer and calculates and processes data.
D
DF(Document Frequency)
The number of documents, within the document group being analyzed, in which a given morpheme appears.
Digital Intelligence (DI)
A comprehensive discovery service company in the United States. In addition to discovery and forensic services, it provides forensic-related software, hardware, training, and more.
E
EnCase
Software that is a global standard product in forensic investigation technology and supports forensically sound data collection and investigation. In addition to data preview, it supports data restoration, extraction (narrowing down) of data matching specific conditions, and more.
ESI(Electronically Stored Information)
Information stored electronically; electronic data.
exFAT(Extended FAT)
A file system developed by Microsoft that extends the conventional FAT32. It was developed as a file system for removable media with the aim of supporting large volumes. Sometimes called FAT64.
ext2 / ext3 / ext4 (Extended File System)
The file system used by Linux / UNIX-like operating systems. ext3 / ext4 is a file system with a journaling function.
eDiscovery (E-Discovery)
See eDiscovery System.
Email threading (Email Threading)
Organizing an entire email conversation in order. It helps reduce email review time.
Email family (Email Family)
Treating the body of an email and its attachments as a single group.
F
FAT (File Allocation Table)
A file system that can be recognized by various operating systems such as Windows, Mac OS, and UNIX (Linux). In addition to HDD partitions, it is also used as the format for storage such as FDDs, USB memory, and SD cards. Variants include FAT12, FAT16, and FAT32.
FTK
A global standard product in forensic investigation technology; software that supports data collection and investigation. It can also work together with PRTK and Registry Viewer. It supports data preview, data restoration, extraction (narrowing down) of data matching specific conditions, and more.
FTK Imager
A data preview and image file creation tool. In addition to browsing physically connected drives and logical data, it can also open forensic image files, calculate hash values of data, and perform simple data restoration.
G
Guidance Software
A US company that manufactures and sells EnCase, a forensic investigation tool.
H
HFS / HFS+ (Hierarchical File System)
The file system used by Mac OS, Apple's operating system. Journaling was added in HFS+ to improve data reliability.
I
IDE (Integrated Drive Electronics)
An interface for connecting hard disks that is commonly used in PCs. Its simple structure helps keep the price of hard disk drives down. The standard currently in use is "EIDE (Enhanced IDE)", an extension of the original IDE.
Intelligent Computer Solutions (ICS)
A US company that manufactures and sells Solo-4.
iOS
The OS installed on Apple mobile devices such as the iPhone, iPad, and iPod touch. It was developed based on "Mac OS X (Ten)", the company's OS for Macintosh PCs. It features a user interface designed for touch panels.
J
Jupyter Notebook
A web application that allows you to create and share documents containing program code including Python, mathematical formulas, diagrams, and explanations.
K
KIBIT
The name of FRONTEO's AI, read as "Kibit". It is a coined word that combines the Japanese word for "subtleties" (KIBI) with "bit" (BIT), the smallest unit of information, and means "artificial intelligence that understands the subtleties of human beings". It performs supervised learning on text data labeled as correct or incorrect, and can automatically find what the correctly labeled texts have in common from the features (morphemes) they contain.
KIBIT Email Auditor
An email audit tool. It is used to detect signs of information leaks and cartels.
KIBIT Find Answer
An FAQ tool. By learning the characteristics of questions entered in natural language, it can extract similar past questions and quickly present the answers given by experts.
KIBIT G2
A next-generation version of the independently developed artificial intelligence engine "KIBIT", strengthened to make AI implementation more versatile. Read as "Kibit G2".
KIBIT Knowledge Probe
A general-purpose business data analysis support tool for text data. It analyzes text data such as daily business reports and customer inquiries.
KIBIT Patent Explorer
A patent search tool. Because there is no need to formulate complicated search expressions, surveys can be carried out more efficiently. It is equipped with a database of Japanese and US patent gazettes.
KIBIT-Connect
A Web API that facilitates integration between KIBIT G2 and external systems. Through KIBIT-Connect, operations such as registering text data, creating teacher data, and providing scores as analysis results can be linked seamlessly.
L
LBA(Logical Block Addressing)
A method of indicating the data position on the hard disk by assigning a serial number to all sectors on the hard disk and specifying the sector by that number.
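The relationship between LBA and the CHS method described earlier can be sketched as a small conversion. This is an illustrative sketch using the conventional formula LBA = (C × heads per cylinder + H) × sectors per track + (S − 1); the function names and the geometry values (16 heads, 63 sectors per track) are assumptions chosen for the example, not values taken from this glossary.

```python
def chs_to_lba(cylinder, head, sector, heads_per_cylinder, sectors_per_track):
    """Convert a CHS address to an LBA sector number.

    In CHS addressing, cylinders and heads are numbered from 0,
    while sectors are numbered from 1.
    """
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sector must be between 1 and sectors_per_track")
    if not 0 <= head < heads_per_cylinder:
        raise ValueError("head must be between 0 and heads_per_cylinder - 1")
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


def lba_to_chs(lba, heads_per_cylinder, sectors_per_track):
    """Convert an LBA sector number back to a (cylinder, head, sector) tuple."""
    cylinder, remainder = divmod(lba, heads_per_cylinder * sectors_per_track)
    head, sector_index = divmod(remainder, sectors_per_track)
    return cylinder, head, sector_index + 1


# Hypothetical geometry: 16 heads per cylinder, 63 sectors per track.
first_sector = chs_to_lba(0, 0, 1, 16, 63)
second_cylinder = chs_to_lba(1, 0, 1, 16, 63)
```

With this geometry, the first sector of the disk maps to LBA 0, and each full cylinder spans 16 × 63 = 1,008 sectors.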
Linux
A UNIX-compatible OS developed by Linus Torvalds. Available free of charge and provided as open source, it has been improved by developers and users around the world, and many distributions have emerged.
Lit i View XAMINER / Lit i View E-Discovery
Software for searching and reviewing data, developed independently by FRONTEO. Documents can be scored using artificial intelligence-related technology, which makes it suitable for efficiently reviewing large numbers of documents. It can process many mail formats, handles mail software specific to Japan well, and is strong at analyzing Asian languages such as Japanese, Chinese, and Korean. In addition, the Central Linkage function displays a correlation diagram of the people under investigation in an easy-to-understand way.
* In-house product
M
Mac OS
OS installed in Apple's Macintosh.
MacQuisition
An image file creation tool for Mac provided by BlackBag. As with FTK Imager, data can be acquired on the booted machine.
MBR(Master Boot Record)
The area in the first sector of the hard disk, which is read first when the computer starts up. The sequence from power-on until the Windows OS or a program has booted and is usable is: "PC power on -> BIOS starts -> MBR is read -> boot loader (the program that reads the OS from the HDD and starts it) is read -> OS boots".
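As an illustration of what the first sector contains, the sketch below parses a 512-byte MBR. It relies on the conventional MBR layout, which is not described in this glossary: a partition table of four 16-byte entries starting at offset 446, and the boot signature 0x55 0xAA in the last two bytes. The function name and the returned dictionary keys are hypothetical.

```python
import struct

MBR_SIZE = 512
PARTITION_TABLE_OFFSET = 446   # conventional offset of the first partition entry
PARTITION_ENTRY_SIZE = 16
BOOT_SIGNATURE = b"\x55\xaa"   # last two bytes of a valid MBR


def parse_mbr(sector):
    """Return the non-empty partition entries found in a raw MBR sector."""
    if len(sector) < MBR_SIZE:
        raise ValueError("an MBR sector must be at least 512 bytes")
    if sector[510:512] != BOOT_SIGNATURE:
        raise ValueError("boot signature 0x55AA not found")

    partitions = []
    for index in range(4):
        offset = PARTITION_TABLE_OFFSET + index * PARTITION_ENTRY_SIZE
        entry = sector[offset:offset + PARTITION_ENTRY_SIZE]
        partition_type = entry[4]
        # Starting LBA and sector count are little-endian 32-bit integers.
        first_lba, sector_count = struct.unpack_from("<II", entry, 8)
        if partition_type != 0:  # type 0 marks an unused entry
            partitions.append({
                "index": index,
                "bootable": entry[0] == 0x80,
                "type": partition_type,
                "first_lba": first_lba,
                "sectors": sector_count,
            })
    return partitions
```

In practice the 512 bytes would be read from the start of a disk image opened read-only; here the function only interprets bytes it is given.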
MFT(Master File Table)
One of the system files created for each NTFS-formatted partition. It is a collection of management information for the data recorded on the partition.
Micro Systemation (MSAB)
A Swedish company that manufactures and sells MSAB Office.
MSAB Office / XRY / XAMN
Software that can extract and analyze data from mobile devices such as smartphones. It consists of XRY (data extraction tool) and XAMN (data analysis tool).
The data extracted by XRY is automatically encrypted, and XAMN is required to analyze the encrypted data. Because the data cannot be opened with other analysis tools, it cannot be tampered with. It is used by police and law enforcement agencies in more than 100 countries around the world.
* Product handled in-house
N
NAS (Network Attached Storage)
A file server that is used by connecting it directly to the network. The housing is relatively small, products are available across a wide price range, and capacities are on the order of several TB (terabytes).
NTFS(NT File System)
A journaling file system that is adopted as standard by the Windows NT system, which is Microsoft's OS. Microsoft calls it the NT File System, but it is sometimes called the New Technology File System.
O
OS
Software that provides each application with an interface that abstracts the hardware of a computer.
P
PRTK
Password analysis tool.
Python
A popular programming language in data science. It features a concise, easy-to-read syntax. It is used for everything from web, database, network, and concurrent applications to large-scale data processing.
R
RAID (Redundant Arrays of Inexpensive Disks)
A technology that combines multiple hard disks (HDDs) and manages and operates them as a single hard disk. It refers to mechanisms aimed at faster access and at ensuring redundancy for improved safety. It can be implemented either with dedicated hardware or in software.
There are seven RAID levels, "RAID 0" through "RAID 6", classified by function. Other types also exist, such as "RAID 7" and "RAID 01", which combines "RAID 0" and "RAID 1".
- "RAID 0" distributes data evenly across multiple HDDs and records it in parallel. Sometimes called "striping". "RAID 0" has no redundancy, because data is lost if even one HDD fails.
- "RAID 1" records exactly the same data on two HDDs at the same time. Sometimes called "mirroring". Redundancy is ensured because two HDDs holding exactly the same data are created.
- "RAID 2" records error-correction code separately from the data. Even the minimum configuration requires two HDDs for data and three HDDs for correction code. The correction code allows data to be recovered even if a data HDD fails, ensuring redundancy.
- "RAID 3" records information for data recovery (parity) separately from the data. The recovery information is recorded on one or more HDDs, and at least three HDDs are used in total. Redundancy is ensured by parity.
- "RAID 4" records data by the "striping" method and also records parity separately. The recovery information is recorded on one or more HDDs, and at least three HDDs are used in total. Redundancy is ensured by parity.
- "RAID 5" uses at least three HDDs and distributes both the data and the parity evenly across them. Even if one hard disk fails, the data can be recovered from the remaining disks, ensuring redundancy.
- "RAID 6", like "RAID 5", distributes data evenly across multiple HDDs (at least four in practice), but the original data can be recovered from the remaining disks even if two hard disks fail, ensuring redundancy.
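The parity-based recovery described for RAID 3 through RAID 5 can be illustrated with XOR parity. This is a minimal sketch, assuming three equal-sized data blocks and a single parity block; real RAID implementations work on disk stripes and rotate parity across disks, and the block contents and function name here are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, as if striped across three HDDs.
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block is the XOR of all data blocks.
parity = xor_blocks(data_blocks)

# Simulate the failure of the HDD holding the second block:
# XOR-ing the surviving blocks with the parity reproduces the lost block.
survivors = [data_blocks[0], data_blocks[2], parity]
recovered = xor_blocks(survivors)
```

Because XOR is its own inverse, any single missing block can be rebuilt this way, which is why one parity block tolerates exactly one disk failure.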
RECON IMAGER
An image file creation tool for Mac provided by SUMURI. It can also acquire memory information.
RECON LAB
An image file analysis tool for Mac provided by SUMURI. It can also analyze memory information.
ReFS (Resilient File System)
A file system with excellent fault resilience, introduced in Windows Server 2012.
Registry Viewer
Registry analysis tool.
S
SATA (Serial ATA)
One of the extended specifications of the IDE (ATA) standard that connects a computer to a storage device such as a hard disk or optical drive.
SCSI (Small Computer System Interface)
One of the standard connection methods for connecting peripheral devices such as storage devices (external storage devices) to the computer itself for communication.
Solo-4
Hardware that can copy and erase data from HDD and USB. One-to-two, two-to-two, and one-to-one x 1 systems can be copied, and the erasing method also supports DoD wipe. It is also equipped with Windows OS and can be used as a computer. FRONTEO's Solo-2 is also equipped with FTK Imager.
SSD (Solid State Drive)
A drive built from a large amount of flash memory. The name is an abbreviation of Solid State Drive; it is also called a silicon disk. It is used in lightweight mobile PCs and tablet PCs because it consumes less power, is lighter, and is less likely to break down.
SUMURI
A company that manufactures and sells RECON IMAGER and RECON LAB, which are preservation and analysis tools for Macintosh.
T
TAR (Technology Assisted Review)
A general term for technologies for efficient review such as Predictive Coding, clustering, and email threading. FRONTEO's strength is Predictive Coding, which is based on the artificial intelligence "KIBIT" that was originally developed, and is working to improve the efficiency of reviews and reduce costs.
TF(Term Frequency)
The number of times a given morpheme appears in the document group being analyzed.
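The difference between TF and DF (Document Frequency, defined under D) can be shown with a small count. This is a minimal sketch that assumes each document has already been split into morphemes; the sample tokens and the function name are invented for the example.

```python
from collections import Counter


def term_and_document_frequency(documents):
    """Count TF (total occurrences) and DF (documents containing the token)."""
    term_frequency = Counter()
    document_frequency = Counter()
    for tokens in documents:
        term_frequency.update(tokens)            # every occurrence counts
        document_frequency.update(set(tokens))   # each document counts once
    return term_frequency, document_frequency


# Hypothetical documents, already split into morphemes.
documents = [
    ["price", "quote", "price"],
    ["quote", "meeting"],
    ["meeting", "schedule"],
]
tf, df = term_and_document_frequency(documents)
```

Here "price" appears twice in total but in only one document, while "quote" appears twice across two documents.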
U
UNIX
An operating system developed at AT&T's Bell Labs (now Nokia Bell Labs) in the United States. Its development is being continued by volunteers and vendors.
USB memory
A general term for external storage media with built-in flash memory that are used by connecting them to a computer's USB (Universal Serial Bus) port. The transmission speed differs with each generation of the standard.
USB 1.0: 12 Mbps
USB 1.1: 12 Mbps
USB 2.0: 480 Mbps
USB 3.0: 5 Gbps
USB 3.1: 10 Gbps
* All the above figures are "maximum data transfer speed"
Because the data copy speed varies with the transmission speed of each standard, the time required for software-based evidence preservation can differ significantly.
USB memory is still frequently the route by which confidential data is taken out (information leakage), so investigating the connection history of external recording devices such as USB memory is one of the effective investigation methods.
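To give a sense of how much the preservation time can differ between standards, the sketch below computes a lower bound on copy time from the maximum transfer speeds listed above. It is only an idealized estimate: real throughput is lower because of protocol overhead and device speed, and the function name and example size are assumptions.

```python
# Maximum data transfer speeds from the list above, in bits per second.
MAX_SPEED_BPS = {
    "USB 1.0": 12_000_000,
    "USB 1.1": 12_000_000,
    "USB 2.0": 480_000_000,
    "USB 3.0": 5_000_000_000,
    "USB 3.1": 10_000_000_000,
}


def min_transfer_seconds(size_bytes, standard):
    """Best-case copy time at the standard's maximum transfer speed."""
    return size_bytes * 8 / MAX_SPEED_BPS[standard]


# Best-case time to copy a hypothetical 64 GB device over each standard.
size = 64 * 10**9
estimates = {name: min_transfer_seconds(size, name) for name in MAX_SPEED_BPS}
```

Even in this best case, the same 64 GB copy that takes under two minutes over USB 3.0 takes many hours over USB 1.1.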
W
Windows
Microsoft OS.
Windows 9x
A general term for Windows 95 / 98 / 98 SE / Me, whose basic design is inherited from Windows 95.
Windows NT
A general term for operating systems built on the basic design of Windows NT, an OS developed for servers and workstations. It is upward compatible with Windows 9x in many respects, but its internal structure is completely different.
Wipe
The word means to wipe, wipe off, or erase. In digital forensics, Wipe means wipe-out (erasure of data), and there are the following three methods of erasing data.
1. Software method: Write specific numbers or characters over the recording medium
[Advantages]
- The HDD can be reused.
[Disadvantages]
- There are many methods to choose from, such as zero fill, random numbers, NSA (National Security Agency), NATO (North Atlantic Treaty Organization), and Gutmann.
- Multiple writes (overwrites) are recommended in view of the possibility of data recovery, but they take processing time.
2. Magnetic destruction method: Expose the recording medium to a strong magnetic field to destroy the device itself electromagnetically
[Advantages]
- It does not depend on the capacity, interface, or OS of the recording medium.
[Disadvantages]
- The HDD cannot be reused, and the result cannot be confirmed visually.
3. Physical destruction method: Physically destroy the recording medium, for example by drilling holes in it
[Advantages]
- It does not depend on the capacity, interface, or OS of the recording medium.
[Disadvantages]
- The HDD cannot be reused, and data remains in the undamaged parts.
One example of wiping is to wipe the destination HDD in advance, during the preparation stage of evidence preservation. This work is called sanitization, and it is carried out so that no trace data remains on the destination HDD and nothing gets mixed in with the preserved evidence data. In addition, when the purpose is to prevent information leakage from residual data on the HDD, it is essential to confirm that the wipe was performed properly.
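The software method and the follow-up check that the wipe was performed properly can be sketched as follows. This is an illustration only, assuming a single zero-fill pass over an ordinary file: overwriting a file through the file system does not guarantee that the underlying sectors are overwritten (for example on SSDs or journaling file systems), so it is not a substitute for a dedicated wiping tool. All function names are hypothetical.

```python
import os
import tempfile


def zero_wipe_file(path, chunk_size=1024 * 1024):
    """Overwrite the file's contents with zeros in place; return bytes written."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        remaining = size
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(b"\x00" * n)
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the zeros to the device
    return size


def verify_zeroed(path, chunk_size=1024 * 1024):
    """Return True only if every byte of the file is zero."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return True
            if any(chunk):
                return False


def demo():
    """Wipe a temporary file and report (bytes written, size after, verified)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "target.bin")
        with open(path, "wb") as f:
            f.write(b"confidential" * 1000)
        written = zero_wipe_file(path)
        return written, os.path.getsize(path), verify_zeroed(path)
```

The verification pass reads the whole file back, which mirrors the point above that a wipe should be confirmed rather than assumed.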
Z
Z-Score
A type of standardized score. An index that measures whether the mean of a sample differs from the mean of the population to a statistically significant degree. A positive Z-Score means the sample mean is higher than the population mean, and the larger its absolute value, the more significant the difference. It is used when comparing multiple learning results of KIBIT.
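As a worked example, the sketch below computes a Z-Score for a sample mean using the standard one-sample formula z = (sample mean − population mean) / (population standard deviation / √n). How KIBIT computes its own Z-Score is not stated here, so this textbook formula and the example numbers are assumptions.

```python
import math


def z_score(sample_mean, population_mean, population_std, sample_size):
    """Z-Score of a sample mean relative to a known population."""
    standard_error = population_std / math.sqrt(sample_size)
    return (sample_mean - population_mean) / standard_error


# Hypothetical values: population mean 50, standard deviation 10,
# and a sample of 25 items whose mean is 52.
z = z_score(52, 50, 10, 25)
```

With these numbers the standard error is 10 / 5 = 2, so a difference of 2 gives z = 1.0. Under the usual two-sided 5% significance level, |z| would need to exceed about 1.96.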
A
Attorney Manager (Attorney Manager)
A qualified review manager who ensures the quality of reviews by communicating with the attorneys and answering questions from reviewers.
Tacit knowledge(Tacit knowledge)
Knowledge based on experience and intuition that is difficult to express in words.
Imaging(Imaging)
Converting files for submission into page images, typically in TIFF or PDF format.
Incident response
Incident response refers to the measures and responses taken after an accident (incident) occurs in an information system or the like. In the field of information security, an incident is any event that poses a threat to information systems in general; it is also called a security incident.
Recent incidents include not only system-focused incidents such as unauthorized access and system failures, but also moral-hazard incidents such as accounting fraud (window dressing), insider trading, and cover-ups.
If an incident occurs in an information system in the course of corporate activities, it may lead to a decline in corporate value. One of the purposes of incident response is to control and minimize risks such as data loss, service degradation, and loss of corporate value.
In incident response, it is important to grasp the incident at an early stage and carry out the initial response "quickly", "accurately", and "smoothly". To carry out an appropriate initial response and investigation, it is important to establish procedures and a response structure during normal times and to conduct regular training.
When an incident actually occurs, preparations and the response structure may not be in place, or the process for resolving the problem may be unknown. From the viewpoint of the reliability and credibility of the investigation, it is also effective to engage a third party such as an investigation firm. The choice of the digital forensics firm that will actually conduct the investigation is therefore an important point. It should be decided not only on price but on overall capability, including experience and quality.
Interview(Interview)
An interview of a person subject to the proceedings (the custodian), conducted by a lawyer or the legal / intellectual property department.
Interface (Interface)
A term meaning "boundary" or "point of contact". In the IT field it mainly means "the point of contact between pieces of hardware, or the devices and programs that serve as that point of contact". Specifically, it refers to the shape and specifications of the physical connection required for communication between hardware such as a computer and its peripheral devices.
Interfaces can be divided into two types according to how they communicate: a "serial interface" transmits data one bit at a time in order, and a "parallel interface" transmits multiple bits at a time.
"SATA" (Serial Advanced Technology Attachment) and "IEEE 1394" are standards of the former type, while "SCSI" (Small Computer System Interface), "IDE" (Integrated Drive Electronics), and "ATA" (Advanced Technology Attachment) are standards of the latter type.
The interface is an important consideration in evidence preservation because it affects whether data can be copied and how fast. For example, when copying data directly from an HDD, the connector differs depending on the interface, so a connection cable for each standard must be prepared in advance. Moreover, since the data copy speed depends on the interface, it is desirable to know the data transfer speed of each standard.
Indexing (Indexing)
[Broad sense] Creating an index so that searches run smoothly on the tool (Lit i View). (* Note: This includes assigning a unique number (Doc. ID) to each document and extracting meta information.)
[Narrow sense] Creating an indexing database after Text Extraction so that keyword searches can be performed.
Browse / Review (Review)
Visual confirmation by FRONTEO reviewers, legal / intellectual property staff, and lawyers of whether the electronic data that has been collected, processed, and analyzed is related to the proceedings.
Endorsement (Endorsement)
Stamping each file with a serial number and the confidentiality level of the information.
Ka
Write protection device
A device that prevents data from being written to a recording medium such as an HDD.
In forensic investigations, in order to maintain the integrity and originality of the evidence data, the data to be investigated and analyzed must be accessed read-only so that nothing is written to it. Because a write protection device blocks all writes to a recording medium such as an HDD, the data can be browsed safely without risk of modification.
extension
The extension is the character string added to the end of the file name to identify what the file is and which application can open it.
The characters after the final "." (period) in the file name are the extension, and the Windows OS determines the file type based on it. Examples of extensions include ".xls" (Excel document), ".jpg" (JPEG image), and ".pdf" (PDF document).
Since extensions are hidden by default in Windows, the setting must be changed in order to see them.
In fraud cases, the extension of evidence data is sometimes changed in order to hide it (for example, changing the Word file extension ".doc" to the image file extension ".jpg").
In addition, some computer viruses try to trick the user into executing the virus file by disguising the extension. If extensions are not displayed, an executable file containing a virus named "Sample.txt.exe" is shown as "Sample.txt", so the user may think it is just a text file and execute it.
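Both disguises described above can be checked mechanically. The sketch below compares a file's extension with the signature ("magic number") at the start of its contents, and flags names such as "Sample.txt.exe". The signature table is deliberately tiny and the list of risky extensions is an assumption; forensic tools use far more complete signature databases. The function names are hypothetical.

```python
import os

# A few well-known file signatures (leading bytes of the file).
SIGNATURES = {
    ".pdf": b"%PDF",
    ".jpg": b"\xff\xd8\xff",
    ".png": b"\x89PNG\r\n\x1a\n",
    ".doc": b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1",  # legacy Office (OLE2) container
}
ALIASES = {".jpeg": ".jpg"}


def detect_type(header):
    """Return the extension whose signature matches the header, or None."""
    for extension, magic in SIGNATURES.items():
        if header.startswith(magic):
            return extension
    return None


def extension_mismatch(filename, header):
    """True if the contents look like a known type that differs from the name."""
    extension = os.path.splitext(filename)[1].lower()
    extension = ALIASES.get(extension, extension)
    actual = detect_type(header)
    return actual is not None and actual != extension


def has_double_extension(filename, risky=(".exe", ".scr", ".bat")):
    """True for names like 'Sample.txt.exe' that hide an executable extension."""
    stem, extension = os.path.splitext(filename)
    return extension.lower() in risky and os.path.splitext(stem)[1] != ""
```

For example, a file named "report.jpg" whose first bytes are the legacy Office signature would be flagged as a mismatch, matching the ".doc" renamed to ".jpg" case above.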
Surcharge reduction and exemption system(Leniency)
A system under which the penalty is reduced or waived when a party that has committed an act violating the Antimonopoly Act, such as collusion (a cartel) in which it was itself involved, voluntarily applies to the Fair Trade Commission and reports the violation as a whistleblower.
Proof of concept(PoC: Proof of Concept)
A demonstration experiment. The effectiveness of a product is investigated and verified using your own data before the product is put into production use.
Write protection device
A tool that prevents data from being written, in order to maintain the originality and integrity of evidence data, which is important in forensic investigations. It disables writing to connected media of various standards, such as HDDs and USB interface devices.
* Products handled in-house
Extended partition(Extended partition)
Among the partitions that divide a hard disk into several parts, a DOS partition other than the primary partition.
Custodian(Custodian)
A holder of related data and documents, who becomes the target of information disclosure.
Culling (Culling)
Selecting specific files, excluding unnecessary ones (system files, etc.), before submission to court or review by a lawyer. FRONTEO uses EnCase to extract only the necessary data (files created by the user).
Storage device (Memory)
One of the five major devices. The main memory inside the computer is called the main storage device; the CPU exchanges data with it directly, and its contents are lost when the computer is powered off. A recording medium that does not lose data when the power is turned off, such as a hard disk or USB memory, is called an auxiliary storage device (secondary storage device), as opposed to the main storage device, and holds data created by the user and the OS.
Machine learning(Machine Learning)
Learning iteratively from data to find the patterns hidden in it. Applying the learned results to new data then makes it possible to predict the future according to those patterns. Because algorithms that used to be implemented by manual programming can be constructed automatically from large amounts of data, machine learning is applied in a wide range of fields. In the past, humans had to adjust parameters in advance so that the data could be learned easily, but in recent years even this parameter adjustment can be performed automatically.
Supervised / unsupervised learning (Supervised Learning / Unsupervised Learning)
"Supervised learning" is a method in which a computer learns the input / output relationship (function) hidden in training data consisting of pairs of input (question) and output (answer). In contrast, "unsupervised learning" is a method that learns only from input data, without outputs (answers). A typical example is clustering, which groups together similar input data.
Teacher data / training data (Training Data)
Data used in supervised learning to have a computer learn how to classify data. KIBIT teacher data consists of data with two types of classification labels: HOT (data that you want to find) and NOT HOT (data that you do not want to find).
Business Specifications / Scope of Work(SOW: Scope of Work)
A document that defines and describes the scope of work required for a project, such as the scope of preservation. (* Note: This document is also used at the Process stage to specify various processing conditions, such as the type of Dedup.)
Client-server method (Client-server system)
A system in which a server holding a large amount of information is connected to client computers via a network and provides information and functions in response to their requests. For example, by connecting a computer and a NAS via a network, the necessary files can be downloaded from the NAS.
cluster(Cluster)
A cluster is the smallest unit of data managed by the OS. It is a group of several sectors, which are the minimum units for writing data; the number of sectors per cluster and the cluster size differ depending on the OS and file system. The OS manages data in clusters because reading and writing sector by sector takes time and is inefficient, so efficiency is improved by handling a fixed number of sectors together as a cluster.
For example, if the cluster size is 4 KB (4,096 bytes) and 1 KB of data is written, the size on disk is displayed as 4 KB (4,096 bytes). The data itself is 1 KB, but because the minimum unit managed by the OS is 4 KB (4,096 bytes), that much space is used. The remaining area of the cluster, to which this data is not written, is called slack space. In digital forensics, residual data left in this slack space can sometimes be restored and analyzed.
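The arithmetic in the example above can be written out as a short calculation. This is a minimal sketch that assumes a 4,096-byte cluster by default; the function names are illustrative.

```python
def size_on_disk(file_size, cluster_size=4096):
    """Space allocated for a file: whole clusters, rounded up."""
    clusters = -(-file_size // cluster_size)  # ceiling division
    return clusters * cluster_size


def slack_space(file_size, cluster_size=4096):
    """Unused bytes left in the last cluster allocated to the file."""
    return size_on_disk(file_size, cluster_size) - file_size


# A 1 KB file on a volume with 4 KB clusters.
allocated = size_on_disk(1024)
slack = slack_space(1024)
```

For the 1 KB file, one full 4,096-byte cluster is allocated and 3,072 bytes of it are slack space.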
Clustering(Clustering)
A general term for methods that classify data according to similar characteristics. It is broadly divided into unsupervised clustering, which classifies data by looking only at the data itself, and classification, which classifies data by referring to labels.
plaintiff(Plaintiff)
The party who filed the proceedings in the proceedings.
Optical character recognition(OCR: Optical Character Recognition)
A technology (or device) that reads handwritten or printed characters, compares them with reference data to identify each character, and converts them into electronic text. Files that contain no text cannot undergo Text Extraction, so they may be converted to text by OCR.
Explicit knowledge(Explicit knowledge)
Knowledge that can be explained and expressed by sentences, charts, mathematical formulas, etc.
morpheme(Morpheme)
The smallest unit of language that has meaning. For example, the definition in the previous sentence, in the original Japanese, can be decomposed into morphemes roughly as "meaning / has / minimum / (particle) / word / (particle) / unit" (results differ depending on the dictionary and method used for decomposition).
Case study sheet (CSS: Case Study Sheet)
A sheet that records the location of the data related to the project, the preservation method, the preservation date and time, and so on.
Language identification (Language Detection)
Identifying the language (or the proportion of each language) used in each document.
Coding (Coding)
Tagging. Tags are applied on the evidence viewing system (Lit i View at FRONTEO) to record determinations such as whether the data is a document related to the proceedings and what kind of document it is.
Five major devices of a computer
A computer is made up of a combination of various parts; this is the name used when their functions are classified into five groups: control device, arithmetic unit, storage device (main / auxiliary), input device, and output device.
Sa
server(Server)
A computer installed on a network that provides service functions and data in response to requests from users (client computers).
Recall(Recall Rate)
The percentage of the correct-answer data in the evaluation data set that was actually retrieved. It is an indicator of completeness, and is used to evaluate the learning results of KIBIT.
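The recall rate can be computed directly from two sets of document IDs. This is a minimal sketch; the document IDs are invented, and how KIBIT selects the retrieved set is outside its scope.

```python
def recall(relevant_ids, retrieved_ids):
    """Fraction of the relevant (correct-answer) documents that were retrieved."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("there must be at least one relevant document")
    return len(relevant & set(retrieved_ids)) / len(relevant)


# Hypothetical evaluation: documents 1-4 are the correct answers,
# and the review retrieved documents 2, 3, and 9.
rate = recall({1, 2, 3, 4}, {2, 3, 9})
```

Two of the four correct documents were retrieved, so the recall rate is 0.5. Document 9 does not affect recall, because recall only measures how much of the correct data was found.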
Creation / Production(Production)
Converting electronic data that was classified in review as related to the proceedings into a format for submission to court.
Fact hearing (Trial)
A hearing held in open court when the case cannot be resolved by settlement or summary judgment.
Collection (Collection)
Collecting all of the targeted electronic data by duplicating it. (Information that exists only on paper may be digitized by scanning and then collected.) (* Note: FRONTEO refers to Collection in EDRM as "preservation of evidence".)
Subpoena (Subpoena)
An order commanding a person to appear in court.
Take testimony (Deposition)
Questioning of a witness by lawyers outside the courtroom, with the testimony recorded.
Discovery(Discovery)
A procedure in US civil proceedings in which the plaintiff and the defendant both disclose evidence related to the proceedings before the trial. Even when it is the US subsidiary that is subject to the proceedings, data held by the Japanese headquarters is also subject to discovery.
Information disclosure support company (Discovery Vendor)
A company that supports discovery. FRONTEO is one such company.
Information management (Information Governance)
The day-to-day management of information by companies, such as the classification and storage of electronic data. At FRONTEO, it is supported by the archive function of Email Auditor.
Evidence preservation (preservation of evidence)
Generally, it refers to securing evidence to be used in civil and criminal proceedings.
In digital forensics, evidence preservation mainly refers to obtaining a complete copy (a copy of the entire area of the HDD, etc.) without altering the data on the HDD or other media in the target PC in any way.
Properly preserved evidence can be given the same evidentiary value as the original, but for that purpose both the credibility of the preservation work and the identity of the original and the duplicate must be ensured.
To ensure the credibility of the work, records must be kept, for example by documenting the work procedure and by photographing or filming the work. To ensure that the original and the duplicate are identical, the hash values of the original HDD and the duplicate HDD must be calculated and compared. In both cases the work must be carried out with an emphasis on objectivity and reproducibility by a third party.
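The identity check described above can be sketched as follows. This is only an illustration: the function names are hypothetical, SHA-256 is an assumed choice of algorithm, and in-memory streams stand in for the original and duplicate HDDs.

```python
import hashlib
import io


def stream_digest(stream, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a binary stream in chunks so a large disk image never has to fit in memory."""
    h = hashlib.new(algorithm)
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()


def verify_copy(original, duplicate, algorithm="sha256"):
    """Return (match, original_hash, duplicate_hash) for two binary streams."""
    original_hash = stream_digest(original, algorithm)
    duplicate_hash = stream_digest(duplicate, algorithm)
    return original_hash == duplicate_hash, original_hash, duplicate_hash


# Stand-ins for the original HDD and its duplicate (assumption: real work
# would open the devices or image files in binary read-only mode instead).
original = io.BytesIO(b"\x00" * 4096 + b"evidence")
duplicate = io.BytesIO(b"\x00" * 4096 + b"evidence")

match, h1, h2 = verify_copy(original, duplicate)
print("identical" if match else "MISMATCH", h1, h2)
```

Recording both hash values in the work log is what lets a third party repeat the comparison later.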
Processing / process(Processing)
Pre-processing to extract compressed files such as e-mails and extract text and metadata in order to analyze and browse the collected electronic data.
Cylinder(Cylinder)
One of the recording units on a hard disk. On a platter, data is recorded in sectors, which are subdivisions of concentric tracks; the cylindrical set of tracks at the same position on every platter is called a cylinder.
Artificial intelligence(AI: Artificial Intelligence)
A general term for technologies that artificially realize human intelligence with machines. "Artificial intelligence" itself is a very broad and highly abstract concept, and its definition is still the subject of research. It is a general term for technologies and methods that attempt to realize on computers functions similar to the learning abilities that humans exercise naturally; FRONTEO's artificial intelligence technology Landscaping and Deep Learning are typical examples.
Redaction(Redaction)
The process of partially blacking out privileged or confidential information.
Sanctions(Sanction)
Fines and other legal penalties imposed for failing to produce data in discovery.
Second review(2nd Review)
A more advanced review carried out after the 1st review. Lawyers and paralegals sort the materials according to whether or not they are relevant to the litigation. Privilege is also checked at this stage.
sector(Sector)
The minimum recording unit of data in a disk-shaped recording device.
The surface of a disk is divided into concentric regions called tracks, and each fan-shaped portion obtained by dividing a track into several parts is called a sector. The size of a sector varies depending on the file system, etc., but with NTFS it is generally 512 bytes per sector. In addition, with the recent increase in hard disk capacity, 4K sectors (4,096 bytes per sector) have also appeared.
A sector is the smallest unit for writing data, but the smallest unit in which a file system manages data is the cluster, which is a group of several sectors. The NTFS cluster size is generally 4,096 bytes, which corresponds to 8 sectors. Data is not managed sector by sector because it would be inefficient for the OS to read and write data one sector at a time.
In digital forensics, it is possible to investigate the residual data of slack space from the sector size and cluster size.
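As a rough illustration of how slack space follows from the sector size and cluster size, the sketch below assumes 512-byte sectors and 4,096-byte clusters (8 sectors per cluster). The function and key names are hypothetical.

```python
def slack_space(file_size, sector_size=512, cluster_size=4096):
    """Break down the allocated-but-unused space behind a file of file_size bytes."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    clusters = -(-file_size // cluster_size)      # ceiling division
    allocated = clusters * cluster_size
    sectors_used = -(-file_size // sector_size)   # sectors that hold file data
    last_sector_slack = sectors_used * sector_size - file_size
    unused_sector_bytes = allocated - sectors_used * sector_size
    return {
        "clusters": clusters,
        "allocated": allocated,
        "last_sector_slack": last_sector_slack,      # tail of the last written sector
        "unused_sector_bytes": unused_sector_bytes,  # untouched sectors in the last cluster
        "total_slack": allocated - file_size,
    }


info = slack_space(10_000)
print(info)
```

For a 10,000-byte file this gives 3 clusters (12,288 bytes allocated) and 2,288 bytes of slack, of which 2,048 bytes are whole sectors that the new file never wrote to and that may still hold older data.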
Proceedings support(Litigation Support)
An arrangement in which a law firm has IT specialists on staff and makes use of legal technology.
Early case evaluation(ECA: Early Case Assessment)
Estimating the risks (time and financial cost) of bringing or defending a lawsuit.
Ta (た row)
Third party committee (Third-party panel)
A third-party committee is an investigative committee composed of neutral third parties who have no direct interest in the parties concerned. It is established when a scandal with a great impact on public opinion, or a case that requires fact-finding, occurs.
The third-party committee is a team of members with specialized knowledge that clarifies the cause of the case and recommends measures to prevent recurrence.
Most of the information held by companies and government offices is electronic data such as electronic files and e-mails, so when a scandal occurs it may be necessary to analyze electronic data. However, because electronic data is highly volatile and easy to modify, advanced digital forensics technology is required to investigate and analyze it without losing evidence.
In recent years, investigations of electronic data by third-party committees have increased, and forensic vendors increasingly take part, either as assistants to the committee or as the team responsible for electronic analysis, in preparing reports that set out the cause and the objective facts.
Chain of custody / continuity of evidence(CoC: Chain of Custody)
A document certifying the continuity of custody. At FRONTEO, it records when, by whom, from whom, and onto which HDD the target data was preserved and brought back, and thereby proves the continuity of custody. After data preservation, the customer confirms and signs the document, and it is then stored.
Duplicate deletion(De-duplication, Dedup)
There are two types: Global Dedup, which removes duplicates across custodians according to custodian priority when the same email exists for multiple custodians, and Custodian Dedup, which removes duplicates within a single custodian according to the priority of file types and target devices. Removing duplicates means the same email does not have to be reviewed more than once, which reduces review time.
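A minimal sketch of the two modes, assuming duplicates are detected by comparing content hashes (the entry does not say how duplicates are identified). The Item class, its fields, and the priority list are hypothetical; for Custodian Dedup the sketch simply keeps the first copy in input order, so a caller would sort items by file-type and device priority beforehand.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    custodian: str
    name: str
    content: bytes


def digest(item):
    return hashlib.sha256(item.content).hexdigest()


def custodian_dedup(items):
    """Keep the first copy of each content hash within each custodian."""
    seen, kept = set(), []
    for item in items:
        key = (item.custodian, digest(item))
        if key not in seen:
            seen.add(key)
            kept.append(item)
    return kept


def global_dedup(items, custodian_priority):
    """Keep one copy per content hash overall, preferring higher-priority custodians."""
    rank = {name: i for i, name in enumerate(custodian_priority)}
    best = {}
    for item in items:
        h = digest(item)
        if h not in best or rank[item.custodian] < rank[best[h].custodian]:
            best[h] = item
    return list(best.values())


items = [
    Item("B", "mail1.msg", b"hello"),
    Item("A", "mail1.msg", b"hello"),
    Item("A", "copy.msg", b"hello"),
    Item("A", "other.msg", b"bye"),
]
per_custodian = custodian_dedup(items)
overall = global_dedup(items, custodian_priority=["A", "B"])
print(len(items), len(per_custodian), len(overall))
```

In this example the four items shrink to three under Custodian Dedup and to two under Global Dedup, which is where the reduction in review time comes from.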
Request for proposal(Request for Proposal)
A document that describes the scope of work required for discovery and eDiscovery. Proposals are prepared based on this Scope of Work.
Submission(Presentation)
Submit materials prepared in accordance with EDRM to public hearings and trials in accordance with legal procedures.
Deep learning(Deep Learning)
A multi-layered machine learning method modeled on human neurons. Features that researchers and engineers previously had to set by hand for each kind of data, such as images and sound, are calculated automatically. A multi-layered network structure is used to learn features of various scales within the data. However, the time required for learning and the amount of training data needed are much larger than with earlier methods.
Discovery(Discovery)
See Discovery.
directory(Directory)
The storage location of data and files; it is called a "folder" in Windows. Directories can be created in a hierarchical structure (tree structure) to hold data and files.
Data collection sheet(Data Collection Sheet)
A sheet that records information about the preserved evidence after data collection.
Precision rate(Precision Rate)
The proportion of the data that artificial intelligence extracted as correct-answer data that actually is correct-answer data. An indicator of accuracy. It is used to evaluate the learning results of KIBIT.
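The following sketch computes this ratio, together with the recall rate defined elsewhere in this glossary, from a set of extracted documents and a set of true correct-answer documents. The document IDs and the function name are made up for illustration.

```python
def precision_recall(extracted, correct):
    """Return (precision, recall) for two collections of document IDs."""
    extracted, correct = set(extracted), set(correct)
    true_positives = len(extracted & correct)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(correct) if correct else 0.0
    return precision, recall


extracted = {"doc1", "doc2", "doc3", "doc4"}          # what the AI flagged
correct = {"doc2", "doc3", "doc5", "doc6", "doc7"}    # what is actually relevant
precision, recall = precision_recall(extracted, correct)
print(precision, recall)  # 0.5 0.4
```

Two of the four extracted documents are correct (precision 0.5), and two of the five correct documents were found (recall 0.4).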
Text extraction(Text Extraction)
Extracting text from the actual data so that it can be searched by keyword.
Digital forensics (Digital forensics)
The Digital Forensics Study Group defines it as "a series of scientific investigation methods and technologies for preserving, investigating, and analyzing electromagnetic records as evidence in response to incidents, legal disputes, and litigation, and for analyzing and collecting information on falsification and damage to electromagnetic records". (Source: Digital Forensics Study Group, "Revised Digital Forensics Encyclopedia", 2014)
In the narrower sense of computer forensics, it refers to creating (preserving) a complete copy (a copy of the entire HDD area) in a format that is difficult to tamper with or change (image data), without altering the data on the HDD under investigation (the evidence), and then investigating and analyzing the data on the duplicate. It also includes restoring, investigating, and analyzing data that has been deleted at the OS level but still remains on the HDD.
Electronic discovery system(E-Discovery)
Discovery in US civil litigation that targets electronic data. A US litigation system that began with the December 2006 amendment of the Federal Rules of Civil Procedure (FRCP) and requires parties to a civil action to disclose relevant internal electronic data such as emails and drawings.
Electronic information disclosure reference model(EDRM: Electronic Discovery Reference Model)
International standard eDiscovery workflow.
Amount of transmitted information / amount of mutual information(Trans information / mutual information)
A quantity that measures the interdependence between two random variables. KIBIT learns by calculating the amount of transmitted information based on the TF / DF of the teacher data. The amount of transmitted information is higher for morphemes that appear frequently only in correct-answer data or only in incorrect-answer data, and KIBIT learns the morphemes that appear frequently only in correct-answer data as the characteristics of correct answers.
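As an illustration only, the sketch below computes the mutual information between "a morpheme appears in a document" and "the document is a correct answer" from document counts. How KIBIT actually derives the value from TF / DF is not described here, so the 2x2 count table, the base-2 logarithm, and the function name are all assumptions.

```python
import math


def mutual_information(n11, n10, n01, n00):
    """Mutual information (in bits) of two binary variables from a 2x2 count table.

    n11: morpheme present, correct answer     n10: present, incorrect answer
    n01: morpheme absent, correct answer      n00: absent, incorrect answer
    """
    n = n11 + n10 + n01 + n00
    cells = (
        (n11, n11 + n10, n11 + n01),
        (n10, n11 + n10, n10 + n00),
        (n01, n01 + n00, n11 + n01),
        (n00, n01 + n00, n10 + n00),
    )
    total = 0.0
    for joint, row_total, col_total in cells:
        if joint:  # a zero cell contributes nothing
            total += (joint / n) * math.log2(joint * n / (row_total * col_total))
    return total


print(mutual_information(25, 25, 25, 25))  # morpheme unrelated to the label
print(mutual_information(50, 0, 0, 50))    # appears only in correct answers
print(mutual_information(0, 50, 50, 0))    # appears only in incorrect answers
```

The value is zero when the morpheme is spread evenly across both groups and equally high when it appears only in correct answers or only in incorrect answers, so the direction of the association has to be read from the counts themselves.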
Bates number(Bates Number)
A serial number assigned to each file that has been converted to the specified format for submission.
Identification(Identification)
Identifying the persons involved in the litigation (custodians) and the locations of all potentially relevant electronic data. FRONTEO attends the interviews needed for identification and provides technical support on whether preservation and collection are feasible.
Track(Track)
One of the recording units in a disk-shaped recording device. On recording media that use magnetic disks, such as hard disks and floppy disks, data is recorded in concentric rings like the annual rings of a tree, and each concentric ring is called a track.
Ha (は row)
partition(Partition)
A logically divided area of the hard disk storage area.
Hash value(Hash Value)
A hash function is a function that converts arbitrary-length data "x" to fixed-length data "h(x)". The value obtained from this hash function is called the hash value.
Hash values are analogous to human DNA or fingerprints and are extremely unlikely to coincide. In the forensic industry, matching hash values are used to show that data has not been tampered with or altered.
Hash functions have two major characteristics. One is that it is very difficult to work back to "x" from the value "h(x)" obtained by the hash function. The other is that, even when "x" is given, it is difficult to find a different "y" such that "h(x) = h(y)". Therefore, a hash value is very nearly unique and can serve as a value that guarantees the identity of data.
Typical examples of hash functions are MD5, SHA-1, and SHA-256. MD5 has been found to be vulnerable, and in Japan the SHA-1 algorithm was also judged vulnerable in 2013. As a result, SHA-1 was moved from the e-Government Recommended Ciphers List to the Monitored Ciphers List.
For MD5, methods exist for producing two different pieces of data, A and B, that have the same hash value. However, because it is not easy to produce data that matches an arbitrary given hash value, MD5 is still used in the forensic industry to verify that data is identical.
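The fixed-length property and the sensitivity to any change in the input can be observed with Python's standard hashlib module. This is a small sketch; the helper name and sample data are arbitrary.

```python
import hashlib


def hex_digest(data, algorithm="sha256"):
    """Return the hash value of data as a hexadecimal string."""
    return hashlib.new(algorithm, data).hexdigest()


one_byte = hex_digest(b"x")
one_megabyte = hex_digest(b"x" * 1_000_000)
# Arbitrary-length input, fixed-length output: both are 64 hex characters.
print(len(one_byte), len(one_megabyte))

original = hex_digest(b"contract amount: 1000")
altered = hex_digest(b"contract amount: 1001")
# Changing a single character yields a different hash value.
print(original == altered)
```

The output length depends only on the algorithm (for example, 128 hexadecimal characters for SHA-512), never on the size of the input.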
Cyclic Redundancy Check (CRC) is sometimes treated as a type of hash value, but since it is a type of error detection code, it is not resistant to data tampering.
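One way to see why a CRC does not resist tampering is that it is an affine function of the message bits: for messages of equal length, the CRC of their XOR combination can be predicted from the individual CRCs alone. The sketch below checks this with CRC-32 from Python's standard zlib module; the sample messages and the helper function are made up for illustration.

```python
import zlib


def xor_bytes(*blocks):
    """XOR several equal-length byte strings together."""
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    out = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


a = b"pay 100 yen to A"
b = b"pay 900 yen to B"
c = b"pay 500 yen to C"

combined = xor_bytes(a, b, c)
predicted = zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c)
# The checksum of the combined data is predictable without hashing it.
print(zlib.crc32(combined) == predicted)
```

Because of this structure, someone who alters data can calculate the compensating change needed to keep the checksum the same, which is not feasible with the cryptographic hash functions described above.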
defendant(Defendant)
The party against whom the lawsuit is brought.
Confidentiality(Privilege)
Protection of certain confidential communications from disclosure. An example is communication about legal advice between a company and its lawyer. At the review stage, electronic data covered by privilege must be excluded from the data to be produced.
Evaluation data(Evaluation Data)
A set of data for evaluating learning results. By using data different from the teacher data as the evaluation data, it is possible to evaluate how well the learning result generalizes.
standard deviation(Standard deviation)
An index that indicates how widely data is spread. (Example: The information "a company with an average annual income of 5 million yen" does not tell you whether a few employees have extremely high incomes or whether every employee earns close to 5 million yen. Using the standard deviation makes it possible to grasp the spread of annual income across all employees.)
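The example can be made concrete with two made-up companies that have the same average salary but very different spreads. The figures are illustrative values in arbitrary units, and the population standard deviation from Python's statistics module is used.

```python
import statistics

# Hypothetical annual incomes (arbitrary units) for five employees each.
company_a = [480, 490, 500, 510, 520]     # everyone close to the average
company_b = [200, 250, 300, 350, 1400]    # one very high earner

mean_a, mean_b = statistics.mean(company_a), statistics.mean(company_b)
sd_a, sd_b = statistics.pstdev(company_a), statistics.pstdev(company_b)

print(mean_a, mean_b)                 # the averages are identical
print(round(sd_a, 1), round(sd_b, 1)) # the spreads are not
```

Both averages are 500, but the standard deviation is about 14 for the first company and about 453 for the second, which is exactly the difference the average alone hides.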
First review(1st Review)
The initial review. Many reviewers are mobilized to sort a large volume of data. Most of the reviews conducted by FRONTEO are of this type.
File system(File system)
A method in which the OS efficiently manages the data stored in a recording medium such as a hard disk. A mechanism that allows you to refer to, create, and delete various data from the OS.
File slack space(File slack space)
The space that became an unused part in a sector when the data was recorded in the sector.
Boot sector(Boot sector)
The sector on the hard disk that contains the program that calls the OS when the computer starts.
format(Format)
Writing initial values to the area (hereinafter, the management area) that stores management information such as the "file system identifier", "partition information", and "directory and file names and structure information" of an electronic storage medium such as an HDD, USB memory, or SD card, so that the medium is ready for use.
If a storage medium that is in use is formatted, the management area is initialized and the saved data is no longer visible from the OS.
When formatting is performed on a Windows PC, the management area is initialized and bad sectors are diagnosed at the same time. By selecting the "Quick Format" option, it is possible to initialize only the management area without diagnosing bad sectors, so formatting finishes in a shorter time than a normal format.
In digital forensics, when investigating an electronic storage medium whose management information has been initialized by formatting, the area where the data itself is stored may not have been overwritten. If residual data remains in that area, it can be restored using a forensic tool, recovery software, or the like.
Forensic copy (Forensic copy)
Copying all data areas such as HDD and USB memory.
When files are copied by copy and paste on Windows, existing data that is visible from the OS can be copied, but areas where deleted data may exist and unused areas cannot.
In a forensic copy, on the other hand, a dedicated device or software is used to copy all data areas, including unused areas and unallocated areas where deleted data may exist, as well as data management information. Therefore, deleted data can be restored from the data acquired by a forensic copy.
There are two main methods of forensic copying. One is the "100% physical copy", which creates a complete clone. The other is the "image file copy", which acquires the target data in an image file format, split into files of a fixed size. Although the two look different, they differ only in the copy format, and the contents obtained are the same.
As described above, a forensic copy can acquire deleted data areas and hidden data areas that cannot normally be acquired, so more information can be secured. In this respect, it can be said that forensic copying guarantees "usefulness".
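The statement that the two copy methods differ only in format can be illustrated with a toy example. It assumes a raw image split into fixed-size segments with no added headers or metadata (actual image file formats may add their own); a byte string stands in for the disk, and all names are hypothetical.

```python
import hashlib


def physical_copy(source):
    """Clone every byte of the source, including unused and deleted areas."""
    return bytes(source)


def segmented_image(source, segment_size):
    """Split the source into fixed-size segments, like a multi-file raw image."""
    return [source[i:i + segment_size] for i in range(0, len(source), segment_size)]


def sha256_of_segments(segments):
    """Hash the segments in order, as if they were one continuous stream."""
    h = hashlib.sha256()
    for segment in segments:
        h.update(segment)
    return h.hexdigest()


# Toy "disk": live data, residue of a deleted file, and an unused area.
disk = b"LIVE-FILE" + b"deleted-file-residue" + b"\x00" * 100

clone = physical_copy(disk)
segments = segmented_image(disk, segment_size=32)

print(len(segments))
print(hashlib.sha256(clone).hexdigest() == sha256_of_segments(segments))
```

Both copies contain the residue of the deleted file and hash to the same value, which is what allows either format to be verified against the original.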
Primary partition(Primary partition)
A partition that can be specified as a boot drive. Up to 4 primary partitions can be created on one HDD.
Platter(Platter)
A disk-shaped part coated with a magnetic material, housed inside a magnetic disk medium such as a hard disk, on which data is recorded. Materials such as aluminum, ceramic, and glass are used.
Predictive coding(Predictive Coding)
A type of TAR. A function that scores data based on the results of a human review of sample documents. The machine strongly supports the conventional human review of litigation-related documents, shortening review time, improving efficiency, and reducing review cost. (* Note: This function is built into the company's product Lit i View. Data is scored from 0 to 10000 points, and the higher the score, the more relevant the data is considered to be.)
Bad sector(Bad Sector)
A sector in which disk access is no longer possible due to physical damage to the disk.
analysis(Analysis)
Narrowing down and classifying electronic data by keyword while taking the priority of review into account.
head(Head)
A small component in a hard disk that reads and writes data on a platter. If data is recorded on both sides of the platter, one head is mounted for each side.
Hosting(Hosting)
Uploading the files extracted during processing to an online evidence review system and building them into a database for viewing.
preservation(Preservation)
Protect electronic data from being improperly tampered with or destroyed. (Example: Easy Hold function of in-house product Lit i View: Sending Litigation Hold Notice email to Custodian, taking questionnaire related to proceedings)
Ma (ま row)
mount(Mount)
Make the computer recognize and operate peripheral devices.
Motherboard(Mother board)
One of the computer parts. A board on which other parts such as CPU and RAM modules are mounted.
Malware (malware)
It is a general term for software created for the purpose of operating computers illegally, and includes Trojan horses, computer viruses, worms, adware, backdoors, spyware, and so on.
Malware behaviors range from malicious ones such as computer hijacking and data theft / destruction to those that frequently display unwanted advertisements on the browser.
Anti-malware measures include firewall settings, installation of anti-virus software, keeping the OS and applications up-to-date, and not opening suspicious attachments in emails.
In digital forensics, investigations cover whether an infection has occurred, whether unauthorized operations were performed as a result of the infection, the extent of the infection, and the route of infection.
Email archive(Mail Archive)
Securely storing email data in a dedicated storage area; multiple email messages are stored in one Mail Archive. Unlike a backup, its purpose is not restoration but reducing data volume and retaining data. Examples include Outlook pst, Outlook Express dbx, and Thunderbird msf.
Meta information(Meta Data)
Information that accompanies data. Examples include the "date and time of creation", "creator", "data format", "title", and "annotation" of the data. (Note: The "creation date and time" here is the one managed by the file itself and is different from the "creation date and time" managed by the file system.)
Character code(Character code)
A number uniquely assigned to each character so that characters can be handled and displayed on a computer. There are systems such as ASCII code, Shift JIS code, and Unicode.
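A short sketch using Python's built-in codecs shows how the same character is assigned different numbers under different systems. The helper name and the sample characters are arbitrary.

```python
def describe(ch):
    """Return the Unicode code point and the UTF-8 and Shift JIS byte sequences (as hex)."""
    return (
        hex(ord(ch)),                    # Unicode code point
        ch.encode("utf-8").hex(),        # bytes under UTF-8
        ch.encode("shift_jis").hex(),    # bytes under Shift JIS
    )


print("A", describe("A"))    # ('0x41', '41', '41')
print("あ", describe("あ"))  # ('0x3042', 'e38182', '82a0')
```

The ASCII letter has the same value in all three, while the hiragana character is stored as three bytes in UTF-8 and two bytes in Shift JIS, which is why text read with the wrong character code appears garbled.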
Ra (ら row)
Litigation hold(Litigation Hold)
An obligation to preserve electronic documents that arises when litigation begins or when the possibility of litigation becomes clear. It takes priority over the company's internal document management rules. If related electronic documents are deleted while this preservation obligation is in force, the deletion is regarded as destruction of evidence and severe sanctions are imposed.
Leniency(Leniency)
See the surcharge reduction and exemption system.
Summary judgment(Summary Judgment)
Terminating a proceeding before trial at the court's discretion, without the need for a jury verdict.
Loose file(Loose file)
A file that is stored on its own, as a single file or a single email, rather than with multiple documents packed into one file as in a Mail Archive or a compressed file such as ZIP or LZH.
Registry(Registry)
A database used in Windows 95 and later versions of Windows that records basic Windows information and the settings of the system and application software.
Reviewer(Reviewer)
At FRONTEO, reviewers are mainly in charge of the 1st review. They use the in-house product Lit i View or Relativity (a competitor's product) to sort data according to whether it is relevant to the litigation (tagging and coding the data).
Review protocol(Review Protocol)
A runbook that describes the background of the review and the criteria for tagging. Reviewers use it as a guide when conducting reviews.
Review manager(Review Manager)
The person who manages the review. The review manager manages the quality and progress of the review and reports to lawyers and project managers.