The Future of Fraud Detection

18 Friday Apr 2014

Posted by 1nicholasgarcia in 03_Fraud detection, Uncategorized

By Rohan Nanda, Nicholas Garcia, & Alejandra Caro Rincon

Advances in technology give criminals increasingly powerful tools to commit fraud, especially using credit cards or internet bots. To combat the evolving face of fraud, researchers are developing increasingly sophisticated tools, with algorithms and data structures capable of handling large-scale complex data analysis and storage.

Source: Merchant911[1]

The most popular area of current fraud detection research has been credit card fraud, but we see online bots and ad click fraud as growing concerns for the future. With the rapid reduction in the cost of computing power, publishers can exploit vulnerabilities by creating bots that click on ads to generate more revenue.

Source: Phua, Clifton, et al. “A comprehensive survey of data mining-based fraud detection research.” arXiv preprint arXiv:1009.6119 (2010).

Credit-card Fraud Detection

Banks typically implement a single fraud detection and prevention system that tries to capture fraudulent transactions with one model generalized to all their customers. This network model incorporates general fraud trends from different products across the bank. However, the approach is ineffective in the long run because such a model is too broad to catch ever more sophisticated forms of fraud. Credit card associations are therefore combining network models with custom models to build a comprehensive system that detects fraud at the point of sale. For instance, MasterCard implements the following approach:

Source: MasterCard Fraud Analytics

With a diverse set of data mining and neural network analysis techniques, and over 100 parameters to evaluate, MasterCard’s Expert Monitoring system aids issuer banks in detecting fraud within minutes of the transaction[2].

Source: MasterCard Fraud Analytics

Custom models, or targeted modeling, enhance the accuracy of fraud detection by pulling customer-specific data points[2]. We expect this technique to become standard across card associations and banks in the future. Nonetheless, the approach is difficult because customers have privacy concerns about how their data is used. The first challenge the credit companies must master is therefore implementing such a system without spooking the customer. A second challenge is the timeliness of detection. Customers want their transactions approved in seconds, not minutes, so better machine learning algorithms are needed to flag fraudulent transactions in real time. Standardized techniques are desirable across industries, but they must account for user heterogeneity and security preferences, and models have to be constantly updated in order to detect and learn emerging fraudulent behaviors.

Other challenges in fraud detection systems include but are not limited to:

Imbalanced data distribution: the number of fraudulent transactions is much smaller than the number of legitimate ones. Models trained on such data tend to perform poorly, so bootstrapping and other resampling techniques are used to rebalance the classes and ‘con’ the model into thinking that it has more fraud examples to work with[3].
Non-stationary data: with a continuous stream of transactions arriving, models have to be retrained often. This problem is compounded by the imbalanced class distribution[3].
Non-availability of public data: due to the sensitive nature of the topic, datasets are often not available to evaluate existing fraud detection methods effectively[3].
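
To make the resampling idea concrete, here is a minimal sketch of random oversampling in Python. It is only an illustration under our own assumptions: the function name `oversample_minority`, the `is_fraud` field, and the toy transactions are hypothetical and are not taken from [3].

```python
import random


def oversample_minority(rows, label_key="is_fraud", seed=0):
    """Resample fraud rows with replacement until both classes are the same size."""
    rng = random.Random(seed)
    fraud = [r for r in rows if r[label_key]]
    legit = [r for r in rows if not r[label_key]]
    # Nothing to rebalance if there is no fraud or fraud is not the minority.
    if not fraud or len(fraud) >= len(legit):
        return list(rows)
    # Draw extra fraud rows (with replacement) to close the gap.
    extra = rng.choices(fraud, k=len(legit) - len(fraud))
    balanced = legit + fraud + extra
    rng.shuffle(balanced)
    return balanced


# Toy data (hypothetical): 98 legitimate transactions and 2 fraudulent ones.
transactions = [{"amount": 20.0 + i, "is_fraud": False} for i in range(98)]
transactions += [
    {"amount": 950.0, "is_fraud": True},
    {"amount": 1200.0, "is_fraud": True},
]

balanced = oversample_minority(transactions)
fraud_share = sum(r["is_fraud"] for r in balanced) / len(balanced)
print(len(balanced), fraud_share)
```

Resampling of this kind is normally applied to the training split only, so that copies of the same fraudulent transaction do not end up in both the training and the evaluation data.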

Online Ad Click Fraud Detection

Most of our entries have focused on fraud detection in the financial industry. In this entry we also want to cover an emerging kind of fraud outside finance, how it harms businesses, and the strategies the industry uses to detect it.

Online click fraud is the act of clicking on advertisements without any genuine interest in the product. The practice is usually carried out systematically by software, and it inflates the marketing expenses of the business offering the product. It also harms the credibility of advertising companies and of the online advertising industry as a whole. Click Forensics estimates that fraudulent clicks account for about 19% of all ad clicks. [4]

Several machine learning techniques are being developed to identify fraudulent clicks. One important technique is detecting duplicate clicks over decaying windows. These models discard expired information, either once a certain number of objects has been collected or once a certain period of activity has passed, and run the analysis on what remains. Some of the most common implementations are based on Bloom filters, a probabilistic data structure for testing whether an element belongs to a set. A Bloom filter never produces false negatives: if a click has already been recorded in the current window, the filter always reports it as seen, so a duplicate click is never passed off as new. The trade-off is a small, tunable rate of false positives, where a first-time click is occasionally reported as a duplicate.

Yet beyond the technical side of this problem, it is important to note the role of regulation and the challenges it poses around the world. The heterogeneity of regulatory frameworks across countries makes fraud detection much harder for many industries. For instance, in countries where electronic privacy laws are strict, it is harder to gather data, detect fraudulent patterns, and thus track and identify fraudsters. To learn more about the specific tools that are in the process of being implemented to combat fraud, please see our Overview of the Industry blog post here.
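
As a rough sketch of duplicate-click detection with a Bloom filter, the Python below keeps one filter per count-based window and starts a fresh filter when the window is full. The class names, the bit-array size, the number of hash functions, and the `ip|ad` click identifier are our own assumptions for illustration; published algorithms use more refined window schemes than this simple reset.

```python
import hashlib


class BloomFilter:
    """A small Bloom filter: membership tests may give false positives, never false negatives."""

    def __init__(self, num_bits=8192, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive several bit positions by salting one hash function with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


class DuplicateClickDetector:
    """Flags repeat clicks within a count-based window (hypothetical helper)."""

    def __init__(self, window_size=1000):
        self.window_size = window_size
        self.filter = BloomFilter()
        self.seen = 0

    def is_duplicate(self, click_id):
        # Expire the old window by starting over with an empty filter.
        if self.seen >= self.window_size:
            self.filter = BloomFilter()
            self.seen = 0
        duplicate = click_id in self.filter
        self.filter.add(click_id)
        self.seen += 1
        return duplicate


detector = DuplicateClickDetector(window_size=3)
clicks = ["ip1|ad7", "ip2|ad7", "ip1|ad7", "ip1|ad7"]
flags = [detector.is_duplicate(c) for c in clicks]
print(flags)
```

In this toy run the third click repeats the first inside the same window and is flagged, while the fourth arrives after the window has expired and is treated as new.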

Works Cited

[1]Merchant911. (n.d.). Credit Card Fraud Trends. Retrieved April 17, 2014, from http://www.merchant911.org/fraud-trends.html

[2]MasterCard. (n.d.). How the Past Changes the Future of Fraud. Retrieved April 17, 2014, from http://www.mastercard.com/us/company/en/docs/Modeling_white_paper.pdf

[3]Pozzolo, A. D. (n.d.). Learned lessons in credit card fraud detection from a practitioner perspective. Retrieved April 18, 2014, from http://www.sciencedirect.com/science/article/pii/S095741741400089X

[4] http://searchengineland.com/click-fraud-q42010-62471

Technologies for fraud detection

11 Friday Apr 2014

Posted by acarorin in 03_Fraud detection

By: Rohan Nanda, Nicholas Garcia and Alejandra Caro In our previous blog we noted that fraud detection tools implemented are …

Fraud detection in the real world

04 Friday Apr 2014

Posted by 1nicholasgarcia in 03_Fraud detection, Uncategorized

Tags

Fraud Detection

Introduction – The Fraud Analysis Landscape

Using data analytics for fraud detection is not just a fad. With computing power prices falling, companies can access scalable real-time analytics solutions at a reasonable cost. [1]

However, in general, most eCommerce companies (a.k.a. merchants) lack the expertise to implement sophisticated solutions. To address this, several companies (analytics vendors) provide fraud detection services while some large companies build their own solutions internally.

The network effect helps these data vendors catch fraud across a large set of merchants. For example, if 1,000 merchants contract with a vendor, the 1,001st merchant has the great benefit of knowing whether one of their customers was implicated in fraud schemes elsewhere. Data vendors also do fraud scoring, and may claim to have “machine learning” algorithms. Despite this, many vendors have very simple models behind their scores. Typically, they specialize in analyzing one area, such as device IDs, IPs, or phone numbers. Others will actually take ALL of the merchant’s data and run the entire machine learning process on the merchant’s behalf: they build models on your data, combine it with other merchants’ data, and give you back a customized risk score.

Some big companies don’t want their data helping competitors, so they keep their data in-house. These larger companies are more likely to employ sophisticated algorithms to detect fraud. These can include neural networks, classification, random forests, logistic regression, unsupervised clustering, etc. Link analysis and graph theory are also extremely useful in identifying fraudsters.

The biggest challenge for fraud detection is to build models that return results in less than one second. To meet that budget, the input data has to be prepared and all models have to be trained ahead of time. Validating models on large data sets can also be time intensive, which calls for parallel computing. Because of these problems, analytics can be a 5-10 person job, which is a lot for a merchant of fewer than 200 people to dedicate to risk analysis. [2]

Notable fraud detection companies:

ThreatMetrix

ThreatMetrix identifies users’ devices by MAC address and builds models to assess risk. [3] According to a company press release, it is the fastest growing context-based security provider. [4] Its models track and evaluate user behavior to assess deviation from the norm, and use classifiers to compare current transactions to past accept/reject/review outcomes of related transactions. [5] ThreatMetrix leverages data across its entire customer base to build a more generalizable model.

One of their principal strengths is “device fingerprinting” that tracks info from the device and browser session. This gets around the problem of fraudsters hiding behind proxies to conceal their true location.

Similarly, this technology can help people whose computers are unknowingly being used as part of a botnet, although it is not clear whether ThreatMetrix is pursuing botnets. [6]

ReD (Retail Decisions)

ReD uses machine learning and graph analysis to detect fraud. They specialize in assessing many small transactions for the risk of fraud automatically. [3] Their service combines neural networks with customizable association rules. They rely on third party services for client device identification, blacklists, and accessing public records. [7]

Guardian Analytics

This company specializes in detecting wire fraud for banks. Their risk engine dynamically adapts to user behavior to detect new fraud attacks. The algorithms run on each bank’s internal data sets.

Behavioral analytics and anomaly detection are used for fraud detection. When an account is compromised, the fraudster’s activity often deviates from the normal user’s behavior. As the number of individual anomalous actions (e.g. suspicious login activities, account reconnaissance, adding users, and suspicious transactions) accumulates, an alarm can be raised. By detecting fraud proactively, the bank can respond even before problems arise. Anomaly detection sorts accounts by risky activity. [8]
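
The accumulation idea described above can be sketched in a few lines of Python. This is not Guardian Analytics’ actual risk engine: the event names follow the examples in the paragraph, but the weights, the alert threshold, and the function `rank_accounts` are hypothetical values chosen purely for illustration.

```python
from collections import defaultdict

# Hypothetical weights per anomalous event type (not taken from any vendor).
EVENT_WEIGHTS = {
    "suspicious_login": 2.0,
    "account_reconnaissance": 1.0,
    "user_added": 3.0,
    "suspicious_transaction": 4.0,
}
ALERT_THRESHOLD = 5.0  # hypothetical cut-off for raising an alarm


def rank_accounts(events):
    """Sum anomaly weights per account and sort accounts from most to least risky."""
    scores = defaultdict(float)
    for account_id, event_type in events:
        scores[account_id] += EVENT_WEIGHTS.get(event_type, 0.0)
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    # Each entry: (account, accumulated score, whether an alarm should be raised).
    return [(acct, score, score >= ALERT_THRESHOLD) for acct, score in ranked]


events = [
    ("acct-1", "suspicious_login"),
    ("acct-2", "account_reconnaissance"),
    ("acct-1", "user_added"),
    ("acct-1", "suspicious_transaction"),
]
ranking = rank_accounts(events)
print(ranking)
```

No single event pushes `acct-1` over the threshold here; the alarm comes from several anomalies piling up on the same account.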

Visa

Visa stores examples of valid purchase transactions to train their detection models. Each time an authorization request is processed, it is compared against the individual’s transaction history. When changes in typical spending patterns are detected, such as a change in billing address, a large purchase, or a change in personal data (e.g. SSN), Visa raises the transaction’s potential risk and notifies the financial institution. [9] Visa Europe also uses mobile phone location as an attribute, in a partnership with ValidSoft Limited.
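
One simple way to compare a new purchase against a cardholder’s history is a z-score on the amount. The sketch below only illustrates that general idea; it is not Visa’s model, and the function `amount_is_unusual`, the threshold of 3 standard deviations, and the sample amounts are all assumptions of ours.

```python
from statistics import mean, stdev


def amount_is_unusual(history, amount, z_threshold=3.0):
    """Return True if the amount is far from the cardholder's past amounts."""
    if len(history) < 2:
        return False  # not enough history to judge
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return amount != mu  # every past purchase had the same amount
    return abs(amount - mu) / sigma > z_threshold


# Hypothetical purchase history for one cardholder.
history = [42.0, 38.5, 51.0, 45.25, 40.0]
large_purchase_flag = amount_is_unusual(history, 900.0)
typical_purchase_flag = amount_is_unusual(history, 47.0)
print(large_purchase_flag, typical_purchase_flag)
```

A production system would look at many attributes at once (merchant, location, time of day, and so on), but the principle of scoring deviation from the customer’s own baseline is the same.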

Vindicia

Vindicia runs an integrated billing solution for digital retail. It identifies the most profitable users in order to extend their lifetime value, and aims to avoid involuntary churn caused by customer payment failures. [10]

“False positives”, where a business incorrectly refuses a valid transaction, are critical to the economics of an online business. When the cost of goods sold is almost zero and margins are high, false positives cost digital businesses much more than an individual fraudulent transaction.

Risky transactions are either fulfilled only once they have been paid, or the company is alerted and decides for itself whether to ask for more information or reject the transaction.

Features that Vindicia uses include:

Distance from IP address geolocation to billing address
Whether or not the user is behind a proxy
Whether the bank and billing address are in the same country
Checking if an email comes from a free email provider or not

These features are compared against a database of previous chargebacks processed by Vindicia.
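
For readers who want to see what such features look like in code, here is a hedged sketch that computes the four of them for one transaction. The field names, the short list of free email domains, and the function `extract_features` are hypothetical; only the haversine great-circle formula is standard. Looking up coordinates for an IP address would need a geolocation database, which is outside this sketch.

```python
import math

# Illustrative sample only; a real list of free email providers is much longer.
FREE_EMAIL_DOMAINS = {"gmail.com", "yahoo.com", "hotmail.com"}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def extract_features(txn):
    """Build the four features listed above from a transaction dict (hypothetical schema)."""
    domain = txn["email"].rsplit("@", 1)[-1].lower()
    return {
        "ip_to_billing_km": haversine_km(*txn["ip_latlon"], *txn["billing_latlon"]),
        "behind_proxy": bool(txn["behind_proxy"]),
        "bank_country_matches_billing": txn["bank_country"] == txn["billing_country"],
        "free_email": domain in FREE_EMAIL_DOMAINS,
    }


txn = {
    "ip_latlon": (40.44, -79.99),       # roughly Pittsburgh
    "billing_latlon": (40.71, -74.01),  # roughly New York City
    "behind_proxy": False,
    "bank_country": "US",
    "billing_country": "US",
    "email": "someone@gmail.com",
}
features = extract_features(txn)
print(features)
```

A feature vector like this could then be compared against, or used to train a model on, a database of past chargebacks.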

Honorable Mentions

MaxMind – An IP database to identify where users are paying from

Accertify (bought by American Express) – Similar to other companies

iOvation – Device identification, proxy piercing

Bluecava

Kount

Ethoca

Telesign

References

[1] Ruotolo, James. “Big Data for Fraud Detection.” Insurance and Technology. Insurance and Technology, 16 May 2013. Web. 04 Apr. 2014. http://www.insurancetech.com/claims/big-data-for-fraud-detection/240155020

[2] Philip McCanna, Boku.com

[3] “What Are the Leading Fraud and Risk Management Companies?” Quora. N.p., n.d. Web. 04 Apr. 2014. http://www.quora.com/What-are-the-leading-fraud-and-risk-management-companies

[4] “ThreatMetrix Enters Online Gaming Market to Protect Casinos and Consumers from Cybercrime.” PRWeb. N.p., 04 Mar. 2014. Web. 04 Apr. 2014. http://www.prweb.com/releases/2014/03/prweb11636100.htm

[5] “Persona ID.” ThreatMetrix. N.p., n.d. Web. 04 Apr. 2014. http://www.threatmetrix.com/technology/persona-identification/

[6] Device fingerprinting defends against online fraud. Networkworld.com (2009-04-20). Retrieved on 2013-08-16.

[7] “Fraud Prevention and Payment Processing.” Fraud Prevention and Payment Processing Solutions from ReD. N.p., n.d. Web. 04 Apr. 2014. http://www.redworldwide.com/

[8] “Online Banking Security Research & Resources – Guardian Analytics.” Online Banking Security Research & Resources – Guardian Analytics. N.p., n.d. Web. 04 Apr. 2014. http://www.guardiananalytics.com/researchandresources/anomaly-detection-infographic-video.php

[9] “Fraud Monitoring.” Digital Payments for Individuals, Businesses & Governments. N.p., n.d. Web. 04 Apr. 2014. http://usa.visa.com/personal/security/fraud-monitoring.jsp

[10] “The True Leader in Enterprise-Class Subscription Billing.” Vindicia. N.p., n.d. Web. 04 Apr. 2014. http://www.vindicia.com/, http://www.vindicia.com/wp-content/uploads/resources-data-sheet-fraud-screening.pdf

Introduction to Fraud Detection

27 Thursday Mar 2014

Posted by rohannanda2014 in 03_Fraud detection

Tags

Introduction

Fraud is a deliberate deception intended to secure unfair or unlawful gain. It is one of the most damaging obstacles to progress for almost one-third of companies worldwide, including banks, insurers, and retail firms¹. Additionally, while the U.S. processed just 24% of the credit card transactions worldwide, it accounted for 47% of global fraud². This is why credit analysts at banks monitor systems that flag transaction alerts. To detect and combat fraud, scientists and researchers have for years been developing sophisticated systems that implement cutting-edge algorithms. It is important to know the role of data science in this process.

Source: http://www.pwc.com/gx/en/economic-crime-survey

How is Fraud Detected?

Fraud detection relies heavily on the detection of anomalies and interesting patterns that deviate from historical data. A variety of supervised and unsupervised learning techniques have been implemented to assess the likelihood of fraudulent events by segmenting and classifying data. Over the course of the next few weeks, we will delve further into organizations that actively detect fraud, algorithms that are widely used in the industry, challenges and limitations in this space, and the future of this interesting application of data science.

¹Global Economic Crime 2014 Survey. (n.d.). PwC. Retrieved March 24, 2014, from http://www.pwc.com/gx/en/economic-crime-survey
²Brustein, J. (2013, December 23). Why the U.S. Leaves Its Credit-Card System Vulnerable to Fraud. Bloomberg Business Week. Retrieved March 24, 2014, from http://www.businessweek.com/articles/2013-12-23/why-the-u-dot-s-dot-leaves-its-credit-card-system-vulnerable-to-fraud

datascienceCMU

~ Learn,Explore,Network on Data Science

Category Archives: 03_Fraud detection
