Data Society was delighted by RStudio’s invitation to participate in the February meetup series. Our company’s co-founders, Chief Executive Officer Merav Yuravlivker and Chief Solutions Officer Dmitri Adler, enjoyed sharing their insights into industry trends for the RStudio Finance Meetup.
This subject is dear to Data Society’s team, which has worked extensively on data science solutions and training of particular significance to the financial services industry. Our founders’ presentation highlighted some of the growing challenges that we have helped our clients in financial services navigate with data science and that we believe will continue to demand attention.
Financial crime remains an escalating threat. Failure to effectively monitor transactions, detect potentially fraudulent activity, and report accordingly can hurt an institution both commercially and from a compliance standpoint. As new forms of fraud evolve, so must the technologies that capture activities warranting alerts. R and Python are well suited to tuning transaction monitoring systems that generate tailored alerts based on specified parameters. They also enable data scientists to analyze transaction data independently for anomalies that may indicate fraudulent activity.
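To make the two ideas above concrete, here is a minimal Python sketch that combines a parameter-based alert with a simple independent anomaly check. It is an illustration under stated assumptions, not a production monitoring system: the threshold, the cutoff, the transaction fields, and the function names are all hypothetical, and the robust z-score (based on the median absolute deviation) is just one of many ways to flag unusual amounts.

```python
from statistics import median

# Hypothetical monitoring parameters; real values would come from compliance policy.
AMOUNT_THRESHOLD = 10_000.0  # alert on any single transaction at or above this amount
MAD_CUTOFF = 3.5             # alert when the robust z-score exceeds this magnitude


def robust_z_scores(amounts):
    """Return modified z-scores based on the median absolute deviation (MAD)."""
    med = median(amounts)
    mad = median([abs(a - med) for a in amounts])
    if mad == 0:
        # All amounts are (nearly) identical, so nothing stands out.
        return [0.0 for _ in amounts]
    return [0.6745 * (a - med) / mad for a in amounts]


def generate_alerts(transactions):
    """Return (transaction_id, reason) pairs for transactions that warrant review."""
    amounts = [t["amount"] for t in transactions]
    scores = robust_z_scores(amounts)
    alerts = []
    for txn, z in zip(transactions, scores):
        if txn["amount"] >= AMOUNT_THRESHOLD:
            alerts.append((txn["id"], "amount_threshold"))
        elif abs(z) > MAD_CUTOFF:
            alerts.append((txn["id"], "statistical_outlier"))
    return alerts


# Made-up sample data for illustration only.
transactions = [
    {"id": "T1", "amount": 120.0},
    {"id": "T2", "amount": 95.0},
    {"id": "T3", "amount": 110.0},
    {"id": "T4", "amount": 105.0},
    {"id": "T5", "amount": 2_400.0},
    {"id": "T6", "amount": 15_000.0},
]
alerts = generate_alerts(transactions)
print(alerts)
```

In this sample, T6 trips the fixed amount rule, while T5 falls below the threshold but is still flagged because it sits far from the typical amount. That second case is the kind a purely rule-based system can miss.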
Risk assessment is another challenge in the financial services industry that, like fraud, has compliance implications. Technology offers powerful solutions, such as RStudio’s toolchain, to ensure your work is scalable, secure, and auditable. At the same time, data science-based tools that automate the complex risk evaluation process demand a thorough understanding of the data and mechanisms that drive their output. For example, we developed a machine learning application in R for the Inter-American Development Bank to assess risk associated with infrastructure projects.
Without proper training, however, financial services professionals can be vulnerable to misleading reporting and inaccurate conclusions. We encourage clients to develop a thorough understanding of the analytical processes and inputs behind automated risk assessments, as well as proficiency in the key data science tools used to produce them.
We also recently introduced Camelsback, a financial risk assessment tool based on the award-winning risk evaluation framework we developed in partnership with Google for the FDIC’s Resilience Tech Sprint. Named after the CAMELS (Capital adequacy, Asset quality, Management, Earnings, Liquidity, and Sensitivity) system, which banking regulators use for rating financial institutions, this Python-based AI engine incorporates data from internal and external sources. It enables managers to calibrate risk weights to generate risk scores from a broad base of information and context.
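As a rough illustration of what calibrating risk weights can look like, the Python sketch below computes a weighted average of component ratings. It is not Camelsback’s implementation: the weights, the ratings, and the function name are hypothetical, and the only assumption borrowed from the CAMELS system is that each component is rated from 1 (strongest) to 5 (weakest).

```python
# Hypothetical weights for the six CAMELS components (illustrative only).
DEFAULT_WEIGHTS = {
    "capital_adequacy": 0.25,
    "asset_quality": 0.20,
    "management": 0.20,
    "earnings": 0.15,
    "liquidity": 0.10,
    "sensitivity": 0.10,
}


def risk_score(component_ratings, weights=DEFAULT_WEIGHTS):
    """Return the weighted average of component ratings (1 = strongest, 5 = weakest)."""
    missing = set(weights) - set(component_ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(weights[name] * component_ratings[name] for name in weights)
    # Dividing by the total lets a manager change weights without renormalizing them.
    return weighted_sum / total_weight


# Made-up ratings for a single institution.
ratings = {
    "capital_adequacy": 2,
    "asset_quality": 3,
    "management": 2,
    "earnings": 3,
    "liquidity": 1,
    "sensitivity": 2,
}

baseline = risk_score(ratings)

# Recalibration example: treat every component as equally important.
equal_weights = {name: 1.0 for name in DEFAULT_WEIGHTS}
recalibrated = risk_score(ratings, equal_weights)

print(round(baseline, 2), round(recalibrated, 2))
```

Changing the weights changes the score even though the underlying ratings stay the same, which is why understanding the inputs and calibration matters when interpreting an automated risk score.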
The introduction of external data in tools such as Camelsback points to an additional industry trend: the increasing reliance on data from non-traditional sources to improve outcomes, including risk assessment and fraud detection. Applications that leverage documents such as market reports, news articles, and social media posts introduce new analytical dimensions. R and Python provide extensive libraries for natural language processing, enabling financial institutions to draw insights from text data through techniques such as sentiment analysis, topic modeling, and text classification. The capacity to gather and process less structured, more qualitative data seems likely only to gain importance in many areas of financial services analytics.
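To show the basic idea behind sentiment analysis, here is a deliberately small Python sketch that scores a headline against a word list. The lexicon and function name are hypothetical and far too small for real use; in practice, teams would rely on an established NLP library and a domain-specific lexicon or a trained model.

```python
import re

# Tiny hypothetical lexicon, for illustration only.
POSITIVE = {"growth", "profit", "upgrade", "strong", "beat"}
NEGATIVE = {"fraud", "default", "lawsuit", "downgrade", "loss", "weak"}


def sentiment_score(text):
    """Return a score in [-1, 1]: (positive - negative) / matched lexicon words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(token in POSITIVE for token in tokens)
    negative_hits = sum(token in NEGATIVE for token in tokens)
    matched = positive_hits + negative_hits
    if matched == 0:
        return 0.0  # no lexicon words found, so treat the text as neutral
    return (positive_hits - negative_hits) / matched


# Made-up headlines.
headlines = [
    "Regulators opened a fraud probe after the lender reported a surprise loss",
    "Strong profit growth led to an upgrade",
    "The bank published its annual report",
]
scores = [sentiment_score(h) for h in headlines]
print(scores)
```

A score like this can then be fed into a risk model as one more variable alongside traditional financial data.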
In this climate, data science has become a valuable tool for financial institutions striving to exploit previously untapped reserves of informative data. An additional trend related to the mining of non-traditional data is the movement toward Environmental, Social, and Governance (ESG) disclosure. Although ESG regulations are still nascent in the US, there is momentum behind financial institutions accounting for variables such as ecological impact, workforce practices, and equity when assessing risk and making lending decisions. Like emerging risk analysis and fraud detection processes, measuring ESG requires an understanding of the types and sources of information relevant to such outcomes and of how to quantify the data mined from them.
In a world in constant motion, the financial services industry must be ever-responsive to fresh demands related to monitoring, analytics, assessment, compliance, and reporting. Data science tools and methodologies play critical roles in helping financial institutions access and leverage the data most relevant to these evolving challenges. However, increased reliance on data science solutions should be accompanied by increased investment in data science training to ensure their output’s integrity, accuracy, and interpretability.
Watch Dmitri Adler and Merav Yuravlivker’s webinar, The Shift to Data: Industry Trends in Finance, below: