The database management team, part 1: Cogs in the wheel of big data

Airline travellers breathing a sigh of relief on landing at JFK, Heathrow, Orly or any other airport will hardly know that a database analyst at A Data Pro has had a part in making their flight safe. Compiling data for terrorist watchlists is only a small part of our team’s work.

We talked to some people from the database management team about their experience of collecting and analysing large volumes of data from media reports, sanction lists, agency and police records. Their projects are varied – media contacts databases, adverse media screening, article matching, transliteration of names, website evaluation, etc.

Taming the data tide: how does it work?

Most of the raw information we catalogue and scrutinise turns into neat watchlists of companies and individuals to be avoided in business transactions because of their history of fraud, embezzlement, bribery, etc., says Ivaylo Panterov, project manager of the scrutinized companies list team, showing a spreadsheet of entities (organisations, companies, individuals), crimes, verdicts and penalties.

“You name the crime, and we have it. We often come across funny records of petty crime such as an Indian guy stealing a goat or a smuggler caught by border police with 200 small lizards in his pants,” Ivaylo smiles.

Where do they get the information from? Ivaylo opens a long list of sources – financial and antitrust regulators, police departments, defence ministries, revenue agencies, banks, anti-money laundering offices, anti-corruption agencies, etc. Why banks? “For delinquent borrowers”, the project manager explains.

“Sometimes it’s like looking for a needle in a haystack. You only have a name and need to make a lot of guesses and checks to validate it,” Ivaylo adds.

Help in identifying criminals may come from the least expected quarter: social media, analyst Tsveta Asenova notes. “We have read reports about suspects checking in on Foursquare and Facebook at the crime scene and being caught by the police as a result.” Tsveta is proficient in Norwegian and Russian, but also helps with Danish and Swedish. She enjoys browsing through Scandinavian media websites – they are so different and fascinating.

The analysts don’t seem to be swept away by the rising tide of information. They are relaxed, patient and competent in what they are doing. The entries must be relevant and the data sets sound, because errors are costly. The job requires concentration, long searches and knowledge of the industry, says Mariya Hantova, who has been on the media contacts database team for four years and works with Spanish, Catalan, English and sometimes Portuguese.

Maintaining a media contacts database doesn’t require checks of police and court records or terrorist lists. Compiling and updating a directory of journalists with their beats, media outlets and contact details involves browsing online media, LinkedIn and Facebook. The project is a favourite with most analysts.

The directory helps PR professionals pitch a story to a journalist with the right competence, location and publisher (print, broadcast and online), project manager Borislav Ankov explains. Journos, on the other hand, get leads, contracts, commissions. See Borislav’s post “How database management moves you ahead with your media content”.

Analyst’s little helpers and other tricks

Compiling databases would be very tedious without the analysts’ little helpers – the crawlers. They take over work that would otherwise be done by hand: they scrape lists and tables from websites and slot the information into ordered tables. The scripts were developed by our IT department. “Most of all I enjoy the automation. It helps us a lot with structured and semi-structured sources, increases productivity and improves quality,” Ivaylo notes.
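To give a feel for what such a crawler does, here is a minimal sketch in Python. It is not the script our IT department wrote; the class name `TableScraper`, the function `scrape_table` and the sample HTML are all made up for illustration, and only the standard library is used. It reads an HTML table and turns each row into a record keyed by the column headers.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Hypothetical helper: collect the text of every table cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row being built, or None outside <tr>
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
            self._cell = None
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def scrape_table(html):
    """Return a list of dicts, using the first row as column names."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Invented sample data, standing in for a page published by a regulator.
SAMPLE_HTML = """
<table>
  <tr><th>Entity</th><th>Offence</th><th>Penalty</th></tr>
  <tr><td>Example Trading Ltd</td><td>Bribery</td><td>Fine</td></tr>
  <tr><td>John  Doe</td><td>Embezzlement</td><td>Two years</td></tr>
</table>
"""

records = scrape_table(SAMPLE_HTML)
for record in records:
    print(record)
```

A real crawler would also have to fetch the pages, cope with messy markup and handle sources that change their layout, which is where most of the effort usually goes.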

“We wouldn’t be able to manage without automated searches, given the growing volumes of work and the diversification of services,” Tsveta adds. “The automated inclusion of keywords enables further searches on the web if you can’t find the entity in a certain database.”
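The keyword idea Tsveta describes could look roughly like the sketch below. This is only an assumption about how such a step might work, not a description of our tools: the keyword list `ADVERSE_KEYWORDS` and the functions `build_queries` and `encode_query` are hypothetical. The entity name is put in quotes as an exact phrase and paired with each keyword in turn.

```python
from urllib.parse import quote_plus

# Hypothetical keyword list; a real one would be far longer and multilingual.
ADVERSE_KEYWORDS = ["fraud", "embezzlement", "bribery"]


def build_queries(entity_name, keywords=ADVERSE_KEYWORDS):
    """Pair the quoted entity name with each keyword to form search queries."""
    name = " ".join(entity_name.split())  # tidy up stray whitespace
    return [f'"{name}" {keyword}' for keyword in keywords]


def encode_query(query):
    """Encode a query so it can be placed in a URL query string."""
    return quote_plus(query)


queries = build_queries("  John   Doe ")
for query in queries:
    print(query, "->", encode_query(query))
```

The analyst still has to judge whether a hit really refers to the same person or company; the automation only saves the typing.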

One of the latest innovations for the media contacts database team is their own database of sources and media outlets. This helps analysts work more efficiently because they don’t need to remember thousands of email templates. “Not that some people-machines like me can’t memorise even greater numbers,” Borislav jokes.

In our next post about the database management team we’ll tell you how rookies grow into fully-fledged analysts and who walks them through the maze of data.