This works like pattern recognition: the programme takes the full text of each article (see News1.csv, column E, ‘content’). The Features Extraction Programme (Python) then recognises and extracts only the important key points, such as:

1) Criminal case: e.g. theft, homicide, gambling, smuggling, etc. (listed in crime word dictionary.txt)
2) Location: e.g. Kg Mentiri, Beribi, Limau Manis, etc. (listed in location.txt)
3) Date: e.g. September 13, 2019; August 28
4) Employed/Unemployed: based on the content, whether the suspect is unemployed or employed and, if employed, in what occupation
5) Gender: usually inferred from Malay names. A name containing ‘Bin’ means the suspect is male; a name containing ‘Bte’ means the suspect is female.
   Kasim bin Omar = Male
   Ahmad Razali bin Ismail = Male
   Siti Noraliza bte Haji Abdullah = Female
6) Age: depends on what is written in the article content, e.g. 36, 43 years old, etc.
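
The key points listed above could be extracted roughly as in the sketch below. This is only an illustration under stated assumptions, not the programme's actual code: the function names (find_terms, guess_gender, extract_features) are hypothetical, the short word lists stand in for crime word dictionary.txt and location.txt, and the employment check is a simple keyword test that does not try to identify the occupation.

```python
import re

# Assumption: small stand-ins for crime word dictionary.txt and location.txt.
CRIME_WORDS = ["theft", "homicide", "gambling", "smuggling"]
LOCATIONS = ["Kg Mentiri", "Beribi", "Limau Manis"]

MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")
# Matches "September 13, 2019" and also "August 28" (the year is optional).
DATE_RE = re.compile(rf"\b(?:{MONTHS})\s+\d{{1,2}}\b(?:,\s*\d{{4}})?")
# Matches "36 years old" and "43-year-old".
AGE_RE = re.compile(r"\b(\d{1,3})[\s-]*years?[\s-]*old\b", re.IGNORECASE)
# A capitalised name part, then bin/bte (any case), then another name part.
NAME_RE = re.compile(r"\b[A-Z][a-z]+\s+(?i:(bin|bte))\s+[A-Z][a-z]+")


def find_terms(text, terms):
    """Return the listed terms that occur in the text as whole words."""
    return [t for t in terms
            if re.search(rf"\b{re.escape(t)}\b", text, re.IGNORECASE)]


def guess_gender(text):
    """Apply the Bin/Bte rule to the first matching name; None if no match."""
    match = NAME_RE.search(text)
    if match is None:
        return None
    return "Male" if match.group(1).lower() == "bin" else "Female"


def extract_features(text):
    """Collect the six key points from one article's content."""
    age = AGE_RE.search(text)
    unemployed = re.search(r"\bunemployed\b", text, re.IGNORECASE)
    return {
        "crimes": find_terms(text, CRIME_WORDS),
        "locations": find_terms(text, LOCATIONS),
        "dates": DATE_RE.findall(text),
        # Assumption: keyword test only; the occupation is not extracted.
        "employment": "Unemployed" if unemployed else None,
        "gender": guess_gender(text),
        "age": int(age.group(1)) if age else None,
    }


if __name__ == "__main__":
    # Invented sentence for illustration only; it is not taken from News1.csv.
    sample = ("Kasim bin Omar, 36 years old, unemployed, was charged with "
              "theft in Beribi on September 13, 2019.")
    print(extract_features(sample))
```

Note that guess_gender only looks at the first name it finds, so an article that names several people would need extra handling to decide which name belongs to the suspect.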

