Machine Learning Archives - 糖心传媒

The Importance of Data Quality in Machine Learning
Mon, 18 Dec 2023

The post The Importance of Data Quality in Machine Learning appeared first on 糖心传媒.


We are currently in an exciting time, where Machine Learning (ML) is applied across sectors from self-driving cars to personalised medicine. Although ML models have been around for a while – for example, algorithmic trading models since the 1980s and Bayes’ theorem since the 1700s – we are still in the nascent stages of productionising ML.

From a technical viewpoint, this is ‘Machine Learning Ops’ or MLOps. MLOps involves figuring out how to build models, deploy them via continuous integration and deployment, and track and monitor models and data in production.

From a human, risk, and regulatory viewpoint we are grappling with big questions about ethical AI (Artificial Intelligence) systems and where and how they should be used. Areas including risk, privacy and security of data, accountability, fairness, adversarial AI, and what this means, all come into play in this topic. Additionally, the debate over supervised machine learning, semi-supervised learning, and unsupervised machine learning, brings further complexity to the mix.

Much of the focus is on the models themselves. Everyone can get their hands on pre-trained models or licensed APIs; what differentiates a good deployment is the data quality.

However, the one common theme that underpins all this work is the rigour required in developing production-level systems, and especially the data necessary to ensure they are reliable, accurate, and trustworthy. This is especially important for ML systems, given the role that data and processes play and the impact of poor-quality data on ML algorithms and learning models in the real world.

Data as a common theme

If we shift our gaze from the model side to the data side, including:

  • Data management – what processes do I have to manage data end to end, especially generating accurate training data?
  • Data integrity – how am I ensuring I have high-quality data throughout?
  • Data cleansing and improvement – what am I doing to prevent bad data from reaching data scientists?
  • Dataset labeling – how am I avoiding the risk of unlabeled data?
  • Data preparation – what steps am I taking to ensure my data is data science-ready?

A far greater understanding of performance and model impact (consequences) could be achieved. However, this is often viewed as less glamorous or exciting work and, as such, is often undervalued. For example, what is the impetus for companies or individuals to invest at this level (such as regulatory, e.g. BCBS, financial, reputational, or legal)?

Yet, as well defined in

“Data largely determines performance, fairness, robustness, safety, and scalability of AI systems…[yet] In practice, most organizations fail to create or meet any data quality standards, from under-valuing data work vis-a-vis model development.”

This has a direct impact on people’s lives and society, where “…data quality carries an elevated significance in high-stakes AI due to its heightened downstream impact, impacting predictions like cancer detection, wildlife poaching, and loan allocations”.

What this looks like in practice

We have seen this in the past, with the exam grading algorithm used in the UK during Covid. In this case, teachers predicted the grades of their students, and the Office of Qualifications and Examinations Regulation then applied an algorithm to these predictions to correct for potential grade inflation. This algorithm was quite complex and, in the first instance, non-transparent. When the results were released, 39% of grades were downgraded. The algorithm drew on the distribution of grades from previous years, the predicted distribution of grades for past students, and then the current year’s predictions.

In practice, this meant that if you were a candidate who had performed well at GCSE, but attended a historically poor-performing school, then it was challenging to achieve a top grade. Teachers had to rank their students in the class, resulting in a relative ranking system that could not equate to absolute performance. It meant that even if you were predicted a B, were ranked fifteenth out of 30 in your class, and the pupil ranked fifteenth in each of the last three years received a C, you would likely get a C.
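
The rank-based effect described above can be illustrated with a deliberately simplified sketch. It is not the regulator's actual algorithm, which was considerably more complex; the function name, the grades of the previous cohort and the class size are all assumptions made purely for illustration.

```python
def grade_from_rank(rank, class_size, historical_grades):
    """Return the grade historically achieved at the same relative position.

    historical_grades: grades from a previous cohort, ordered best to worst.
    """
    # Map the pupil's rank onto the same relative position in the old cohort.
    position = (rank - 1) * len(historical_grades) // class_size
    return historical_grades[position]


# Hypothetical previous cohort of 30 pupils, ordered best to worst.
previous_cohort = ["A"] * 4 + ["B"] * 8 + ["C"] * 10 + ["D"] * 8

teacher_prediction = "B"
assigned = grade_from_rank(rank=15, class_size=30, historical_grades=previous_cohort)
print(teacher_prediction, "->", assigned)
```

In this toy setting the teacher's prediction plays no part in the outcome: only the pupil's rank and the school's past results do, which is the behaviour that drew criticism.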

The application of this algorithm caused an uproar, not least because schools with small class sizes – usually private, or fee-paying, schools – were exempt from the algorithm, so the teacher-predicted grades were used instead. Additionally, it baked in past socioeconomic biases, benefitting underperforming students in affluent (and previously high-scoring) areas while suppressing the capabilities of high-performing students in lower-income regions.

A major lesson to learn from this, therefore, was the need for transparency in the process and in the data that was used.

An example from healthcare

Within the world of healthcare, poor data had an impact on ML cancer prediction with IBM’s ‘Watson for Oncology’, which partnered with The University of Texas MD Anderson Cancer Center in 2013 to “uncover valuable insights from the cancer center’s rich patient and research databases”. The system was trained on a small number of hypothetical cancer patients, rather than real patient data. This resulted in erroneous and dangerous cancer treatment advice.

Significant questions that must be asked include:

  • Where did it go wrong here – certainly with the data, but more generally with the wider AI system?
  • Where was the risk assessment?
  • What testing was performed?
  • Where did responsibility and accountability reside?

Machine Learning practitioners know well the statistic that 80% of ML work is data preparation. Why then don’t we focus on this 80% effort and deploy a more systematic approach to ensure data quality is embedded in our systems, and considered important work to be performed by an ML team?

This is a view recently articulated by Andrew Ng, who urges the ML community to be more data-centric and less model-centric. In fact, Andrew was able to demonstrate this using a steel sheet defect detection use case in which a deep learning computer vision model achieved a baseline performance of 76.2% accuracy. By addressing inconsistencies in the training dataset and correcting noisy or conflicting dataset labels, the classification performance reached 93.1%. Interestingly and compellingly from the perspective of this blog post, minimal performance gains were achieved by addressing the model side alone.

Our view is that if data quality is a key limiting factor in ML performance, then let’s focus our efforts on improving data quality. And can ML be deployed to address this? This is the central theme of the work the ML team at 糖心传媒 undertakes. Our focus is automating the manual, repetitive (often referred to as boring!) business processes of DQ and matching tasks, while embedding subject matter expertise into the process. To do this, most of our solutions employ a human-in-the-loop approach where we capture human decisions and expertise and use this to inform and re-train our models. Having this human expertise is essential in guiding the process and providing context, improving both the data and the data quality process. We are keen to free up clients from manual, mundane tasks so that they can instead use their expertise on tricky cases with simpler agree/disagree options.

To learn more about an AI-driven approach to Data Quality, read our press release about our Augmented Data Quality platform here.

How to test your data against Benford’s Law
Tue, 09 May 2023

The post How to test your data against Benford’s Law appeared first on 糖心传媒.


One of the most important aspects of data quality is being able to identify anomalies within your data. There are many ways to approach this, one of which is to test the data against Benford’s Law. This blog will take a look at what Benford’s Law is, how it can be used to detect fraud, and how the 糖心传媒 platform can be used to achieve this.

What is Benford’s Law?

Benford’s Law is named after a physicist called Frank Benford, but it was first discovered in the 1880s by an astronomer named Simon Newcomb. Newcomb was looking through logarithm tables (used before pocket calculators were invented to find the value of the logarithms of numbers), when he spotted that the pages which started with earlier digits, like 1, were significantly more worn than other pages.

Given a large set of numerical data, Benford’s Law asserts that the first digit of these numbers is more likely to be small. If the data follows Benford’s Law, then approximately 30% of the time the first digit would be a 1, whilst 9 would only be the first digit around 5% of the time. If the distribution of the first digit were uniform, then all nine digits would occur equally often (around 11% of the time). The law also proposes a distribution for the second digit, third digit, combinations of digits, and so on. According to Benford’s Law, the probability that the first digit of a number in a dataset is d, for d = 1, ..., 9, is given by P(d) = log10(1 + 1/d).
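
As a quick check of the figures quoted above, the formula can be evaluated directly. This is a minimal sketch using only the Python standard library; the function name is our own choice for illustration.

```python
import math


def benford_first_digit_probabilities():
    """Expected first-digit probabilities P(d) = log10(1 + 1/d) for d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


expected = benford_first_digit_probabilities()
for digit, probability in expected.items():
    print(f"{digit}: {probability:.1%}")
```

The nine probabilities sum to 1, with the digit 1 at roughly 30% and the digit 9 at a little under 5%, in line with the approximations in the text.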

Why is it useful?

There are plenty of data sets that have been shown to follow Benford’s Law, including stock prices, population numbers, and electricity bills. Because so much data is known to follow Benford’s Law, checking whether a data set follows it can be a good indicator as to whether the data has been manipulated. While a deviation is not definitive proof that the data is erroneous or fraudulent, it can provide a good indication of problematic trends in your data.

In the context of fraud, Benford’s Law can be used to detect anomalies and irregularities in financial data, for example within large datasets such as invoices, sales records, expense reports, and other financial statements. If the data has been fabricated, then the person tampering with it would probably have done so “randomly”. This means the first digits would be uniformly distributed and thus not follow Benford’s Law.

Below are some real-world examples where Benford’s Law has been applied:

Detecting fraud in financial accounts – Benford’s Law can be applied to many different types of fraud, including money laundering and fraud in large financial accounts. Many years after Greece joined the eurozone, the economic data they had provided to the E.U. was analysed against Benford’s Law and found to deviate from it markedly.

Detecting election fraud – Benford’s Law was used as evidence of fraud in the 2009 Iranian elections and was also used for auditing data from the 2009 German federal elections. It has also been applied to data from multiple US presidential elections.

Analysis of price digits – When the euro was introduced, the different exchange rates meant that, while the “real” price of goods stayed the same, the “nominal” price (the monetary value) of goods was distorted. Research carried out across Europe showed that the first digits of nominal prices followed Benford’s Law. However, deviation from this occurred for the second and third digits. Here, trends more commonly associated with psychological pricing could be observed. Larger digits (especially 9) are more commonly found because prices such as £1.99 have been shown to be more associated with spending £1 rather than £2.

How can 糖心传媒 tools be used to test for Benford’s Law?

Using the 糖心传媒 platform, we can very easily test any dataset against Benford’s Law. Take this dataset of financial transactions (shown below). We’re going to be testing the “pmt_amt” column to see if it follows Benford’s Law for first digits. It spans several orders of magnitude, ranging from a few dollars to 15 million, which means that Benford’s Law is more likely to accurately apply to it.

Table of data

The first step of the test is to extract the first digit of the column for analysis. This can very easily be done using a small FlowDesigner project (shown below).

糖心传媒 Flowdesigner product

 

Here we import the dataset and then filter out any values that are less than 1, as these aren’t relevant to our analysis. Then, we extract the first digit. Once that’s been completed, we can profile these digits to find out how many times each occurs and then save the results.
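
For readers without access to FlowDesigner, the same filter, extract and profile steps can be approximated in plain Python. This is only a sketch of the logic, not the FlowDesigner project itself; the function names and the sample amounts standing in for the pmt_amt column are hypothetical.

```python
from collections import Counter


def first_digit(value):
    """Return the leading digit of a number that is at least 1."""
    return int(str(int(value))[0])


def first_digit_counts(amounts):
    """Count leading digits, skipping values below 1 as the flow does."""
    digits = [first_digit(amount) for amount in amounts if amount >= 1]
    return Counter(digits)


# Hypothetical payment amounts standing in for the pmt_amt column.
sample_amounts = [0.5, 12.40, 1500, 23, 1_250_000, 9.99, 187.2, 3400]
counts = first_digit_counts(sample_amounts)
print(sorted(counts.items()))
```

Truncating to an integer before taking the first character avoids problems with decimal points, and is safe here because values below 1 have already been filtered out.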

The next step would be to perform a statistical test to see how confident we can be that Benford’s Law applies here. We can use our Data Quality Manager tool to architect the whole process.

糖心传媒 Data Quality Manager product

Step one runs our FlowDesigner project, whilst the second executes a simple Python script to perform the test. The last two steps let us set up an automated email alert to let the user know if the data failed the test at a specified threshold. While I’m using an email alert here, any issue-tracking platform, such as Jira, can be used. We can also show the results in a dashboard, like the one below.

糖心传媒 product shows Benford's Law

The graph on the left, with the green line, represents the distribution we would expect the digits to follow if they obeyed Benford’s Law. The red line shows the actual distribution of the digits. The bottom right table shows the two distributions and the top right table shows the result of the test. In this case, the reported result is 100% confidence that the data follows Benford’s Law.
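
The post does not say which statistical test the Python script runs. One common choice for comparing observed digit counts with an expected distribution is Pearson's chi-squared goodness-of-fit test, sketched below under that assumption. The function names, the sample counts and the 5% significance level are illustrative choices, not details of the script used here.

```python
import math

# Critical value of the chi-squared distribution with 8 degrees of freedom
# at the 5% significance level (nine digits, so 9 - 1 = 8 degrees of freedom).
CHI2_CRITICAL_DF8_5PCT = 15.507


def benford_chi_squared(counts):
    """Pearson chi-squared statistic of observed first-digit counts vs Benford."""
    total = sum(counts.get(d, 0) for d in range(1, 10))
    if total == 0:
        raise ValueError("no first digits to test")
    statistic = 0.0
    for d in range(1, 10):
        expected = total * math.log10(1 + 1 / d)
        observed = counts.get(d, 0)
        statistic += (observed - expected) ** 2 / expected
    return statistic


def passes_benford(counts, critical_value=CHI2_CRITICAL_DF8_5PCT):
    """True when the test does not reject Benford's Law at the chosen level."""
    return benford_chi_squared(counts) <= critical_value


# Hypothetical counts: one set close to Benford's Law, one uniform.
benford_like = {1: 301, 2: 176, 3: 125, 4: 97, 5: 79, 6: 67, 7: 58, 8: 51, 9: 46}
uniform = {d: 111 for d in range(1, 10)}
print(passes_benford(benford_like), passes_benford(uniform))
```

A result below the critical value means the test found no significant deviation from Benford's Law. That is weaker than proof that the data follows it, so a pass is best read as "no alert raised" rather than as certainty.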

In conclusion…

The law that bears physicist Frank Benford’s name is as useful today as ever. Benford’s Law is a powerful tool for detecting fraud and other irregularities in large datasets. By combining statistical analysis with expert knowledge and AI-enabled technologies, organizations can improve their ability to detect and prevent fraudulent activities, thus safeguarding their financial health and reputation.

Matt Neil is a Machine Learning Engineer at 糖心传媒.

AI Ethics: The Next Generation of Data Scientists
Mon, 04 Apr 2022

In March 2022 糖心传媒 took advantage of the offer to visit a local high school to discuss AI Ethics and Machine Learning in production.

The post AI Ethics: The Next Generation of Data Scientists appeared first on 糖心传媒.

In March 2022, 糖心传媒 took advantage of the offer to visit a local secondary school and the next generation of Data Scientists to discuss AI Ethics and Machine Learning in production. Matt Flenley shares more from the first of these two visits in his latest blog below…

Pictures of two 糖心传媒 employees and two students from Wallace High School Lisburn after an AI Ethics talk
Students from Wallace High School meet Dr Fiona Browne (centre) and Matt Flenley (right)

AI Ethics is often the poster child of modern discourse on when the inevitable machine-led apocalypse will occur. Yet, as we look around at wars in Ukraine and Yemen, record water shortages in the developing world, and the ongoing struggle for the education of girls in Afghanistan, it becomes readily apparent that, as in all things, ethics starts with humans.

This was the main thrust of the discussion with the students at Wallace High School in Lisburn, NI. As Dr Fiona Browne, Head of AI and Software Development, talked the class of second-year A-Level students through data classification for training machine learning models, the question of ‘bad actors’ came up. What if, theorised Dr Browne, people can’t be trusted to label a dataset correctly, and the machine learning model learns things that aren’t true?

At this stage, a tentative hand slowly raised in the classroom; one student confessed that, in fact, they had done exactly this in a recent dataset labelling exercise in class. It was the perfect opportunity to detail in a practical way how much human involvement matters in Artificial Intelligence, Machine Learning, and especially in the quality of the data underpinning both.

Humans behind the machines, and baked-in bias

As is common, the exciting part of technology is often the technology itself. What can it do? How fast can it go? Where can it take me? This applies just as much to the everyday, from home electronics through to transportation, as it does to the cutting edge of space exploration or genome mapping. However, the thought processes behind the technology, imagined up by humans, specified and scoped by humans, create the very circumstances for how those technologies will behave and interact with the world around us.

In her promotion for the book Invisible Women, the author Caroline Criado-Perez writes,

“Imagine a world where your phone is too big for your hand, where your doctor prescribes a drug that is wrong for your body, where in a car accident you are 47% more likely to be seriously injured, where every week the countless hours of work you do are not recognised or valued. If any of this sounds familiar, chances are that you’re a woman.”

Caroline Criado-Perez, Invisible Women

One example is of the comparatively high rate of anterior cruciate ligament injuries among female soccer players. While some of this can be attributed to different anatomies, it is in part caused by the lack of female-specific footwear in the sport (with most brands choosing to offer smaller sizes rather than tailored designs). Yet the anatomical design of the female knee in particular is substantially different to that of males. Has this human-led decision, to simply offer small sizes, taken into account the needs of the buyer, or the market? Has it been made from the point of view of creating a fairer society?

AI Ethics: The Next Generation of Data Scientists
The 糖心传媒 team (L to R: Matt Flenley, Shauna Leonard, Edele Copeland) meet GCSE students from the Wallace High School as part of a talk on Women in Technology Careers

If an algorithm were therefore applied to specify a female-specific football boot from the patterns and measurements of existing footwear on the market today, would it result in a different outcome? No, of course not. It takes humans to look at the world around us, detect the risk of bias, and then do something about it.

It is the same in computing. The product, in this case the machine learning model or AI algorithm, is going to be no better than the work that has gone into defining and explaining it. A core part of this is understanding what data to use, and of what quality the data should be.

Data Quality for Machine Learning – just a matter of good data?

Data quality in a business application sense is relatively simple to define. Typically a business unit has requirements, usually around how complete the data is and to what extent it is unique (there is a wide range of additional data quality dimensions). For AI and Machine Learning, however, data quality is a completely different animal. On top of the usual dimensions, the data scientist or ML engineer needs to consider whether they have all the data they need to create unbiased, explainable outcomes. Put simply, if a decision has been made, then the data scientists need to be able to explain why and how this outcome was reached. This is particularly important as ML becomes part and parcel of everyday life. Turned down for credit? Chances are an algorithm has assessed a range of data sources and generated a ‘no’ decision – and if you’re the firm whose system has made that decision, you’re going to need to explain why (it’s the law!).


This is the point at which we return to the class in Wallace High School. The student tentatively raising their arm would have got away with it, with the model predicting patterns incorrectly, if the student had stayed silent. There was no monitoring in place to detect which user had been the ‘bad actor’ and so the flaw would have gone undetected without the student’s confession. It was, however, the perfect way to explain to this next generation of data scientists the need to free algorithms from bias. In the five years between now and when these students are working in industry, they will need to be fully aware that every aspect of the society people wish to inhabit should be represented in the room when data is being classified and models are being created.

For an industry still so populated , it is clear that the decision to do something about what comes next lies where it always has: in the hearts, minds and hands of technology’s builders.

AI Con | 3 December 2021
Wed, 24 Nov 2021

The post AI Con | 3 December 2021 appeared first on 糖心传媒.


We are delighted to be involved with AI Con again this year. This year, Courtney Lewis, Presales Engineer, and Daniel Browne, Machine Learning Engineer, will be discussing Machine Learning Augmentation.

The north’s premier conference on artificial intelligence, AI Con returns to face-to-face business this year with a hybrid event on Friday 3 December.

As the adoption of AI expands into all areas of our lives, and the business and societal opportunities and challenges become ever more apparent, this ground-breaking conference addresses core issues of the technology for a range of audiences: general, business and specialist.

The event, which is now in its third year, brings together world-leading technology professionals and business leaders to examine how AI is changing our world and the opportunities and challenges that presents.

In-person attendance will take place at Titanic Belfast and will feature some of the top figures in the field, with other leading professionals streaming in from across the globe.

The themes from this year’s event, which hosted 450 attendees in its first year and 800 at the virtual event last year, include:

  • Applied AI: Targeted primarily at a general audience, Applied AI looks at existing, mature technology that can be deployed today and examines case studies on where these are adding value and inspiration for people and their organisations to start their own AI investigations.

Chaired by Kathryn Harkin of Allstate NI, Rachael Bland of Kainos, and Sam Beni of Tech Nation

  • Business of AI: Designed for a business audience, Business of AI looks at how AI can challenge existing business models, create entirely new ones and debates what “AI Startups” need to know in this burgeoning space.

Chaired by Alexandra Mousavizadeh of Tortoise Media and Tom Gray of Kainos.

Attendees are asked to note that strict Covid precautions will be in operation at the in-person event which will be limited to 200 people. Attendees must be double vaccinated and proof of vaccination will be required for entry.

The full programme is available online

Rules Suggestion – What is it and how can it help in the pursuit of improving data quality?
Wed, 15 Sep 2021

The post Rules Suggestion – What is it and how can it help in the pursuit of improving data quality? appeared first on 糖心传媒.

Written by Daniel Browne, Machine Learning Engineer

Defining data quality rules and collections of rules for data quality projects is often a manual, time-consuming process. It often involves a subject matter expert reviewing data sources and designing quality rules to ensure the data complies with integrity, accuracy and / or regulatory standards. As data sources increase in volume and variety, with potential functional dependencies, the task of defining data quality rules becomes more difficult. The application of machine learning can aid with this task, from identifying dependencies between datasets through to uncovering patterns related to data quality and suggesting rules previously applied to similar data.

At 糖心传媒, we recently undertook a Rule Suggestion Project to automate the process of defining data quality rules for datasets through rule suggestions. We use natural language processing techniques to analyse the contents of a dataset and suggest rules in our rule library that best fit each column.  

Problem Area and ML Solution  

Generally, there are several data quality and data cleansing rules that you would typically want to apply to certain fields in a dataset. An example is a consistency check on a phone number column in a dataset, such as checking that the number provided is valid and formatted correctly. Unfortunately, it is not usually as simple as searching for the phrase “phone number” in a column header and going from there. A phone number column could be labelled “mobile”, or “contact”, or “tel”, for example. Doing a string match in these cases may not uncover accurate rule suggestions. We need context embedded into this process and this is where machine learning comes in. We’ve been experimenting with building and training machine learning models to be able to categorise data, then return suggestions for useful data quality and data cleansing rules to consider applying to datasets.
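
To illustrate why a header match alone falls short, the sketch below contrasts it with a check on the column's values. This is not the machine learning model described in this post; the regular expression, the share cut-off, the function names and the sample values are all assumptions chosen only for illustration.

```python
import re

# Deliberately loose, hypothetical pattern: optional "+", then 7 to 15 digits
# once spaces, hyphens, dots and brackets have been removed.
PHONE_PATTERN = re.compile(r"^\+?\d{7,15}$")


def header_says_phone(header):
    """Naive approach: look for the literal phrase in the column header."""
    return "phone number" in header.lower()


def values_look_like_phones(values, min_share=0.8):
    """Content-based approach: do most values match a phone-like pattern?"""
    cleaned = [re.sub(r"[\s\-().]", "", str(value)) for value in values]
    matches = sum(1 for value in cleaned if PHONE_PATTERN.match(value))
    return bool(values) and matches / len(values) >= min_share


column_header = "mobile"
column_values = ["+44 28 9649 6001", "028 9649 6002", "(028) 96496003", "n/a"]
print(header_says_phone(column_header))
print(values_look_like_phones(column_values, min_share=0.7))
```

Here the header check misses the column entirely while the value check recognises it. A trained model can go further by weighing the header, the values and their context together.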

Human in the Loop  

The goal here is not to take away control from the user. The machine learning model isn’t going to run off with your dataset and do what it determines to be right on its own; the aim is to assist the user and to streamline the selection of rules to apply. A user will have full control to accept or reject some or all suggestions that come from the Rule Suggestion model. Users can add new rules not suggested by the model, and this information is captured to improve the model’s suggestions. We hope that this will be a useful tool for users to make the process of setting up data quality and data cleansing rules quicker and easier.

Developer’s View

I’ve been involved in the development of this project from the early stages, and it’s been exciting to see it come together and take shape over the course of the project’s development. A lot of my involvement has been around building out the systems and infrastructure to help users interact with the model and to format the model’s outputs into easily understandable and useful pieces of information. This work involves allowing the software to take a dataset and process it so that the model can make its predictions on it, and then mapping from the model’s output to the individual rules that will be presented to the user.

One of the major focuses we’ve had throughout the development of the project is control. We’ve been sure to build out the project with this in mind, with features such as giving users control over how cautious the model should be in making suggestions by being able to set confidence thresholds for suggestions, meaning the model will only return suggestions that meet or surpass the chosen threshold. We’ve also included the ability to add specific word-to-rule mappings that can help maintain a higher level of consistency and accuracy in results for very specific or rare categories that the model may have little or no prior knowledge of. For example, if there are proprietary fields that may have their own unique label, formatting, patterns or structures, and their own unique rules related to that, it’s possible to define a direct mapping from that to rules so that the Rule Suggestion system can produce accurate suggestions for any instances of that information in a dataset in the future.
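
The two controls described above, a confidence threshold and direct word-to-rule mappings, can be sketched as follows. This is a minimal illustration, not the product's implementation; the function name, the rule names, the scores and the mapping are all hypothetical.

```python
def suggest_rules(column_name, model_scores, threshold=0.7, direct_mappings=None):
    """Combine model suggestions above a threshold with user-defined mappings.

    model_scores: {rule_name: confidence between 0 and 1} from some model.
    direct_mappings: {keyword: [rule_name, ...]} supplied by the user.
    """
    direct_mappings = direct_mappings or {}
    suggestions = []

    # User-defined word-to-rule mappings are applied first, regardless of score.
    for keyword, rules in direct_mappings.items():
        if keyword.lower() in column_name.lower():
            suggestions.extend(rules)

    # Keep only model suggestions that meet or surpass the threshold.
    ranked = sorted(model_scores.items(), key=lambda item: item[1], reverse=True)
    for rule, confidence in ranked:
        if confidence >= threshold and rule not in suggestions:
            suggestions.append(rule)
    return suggestions


scores = {"valid_phone_format": 0.91, "not_null": 0.74, "valid_email": 0.12}
mappings = {"acct_ref": ["matches_account_reference_pattern"]}
print(suggest_rules("tel", scores, threshold=0.8))
print(suggest_rules("ACCT_REF_1", scores, threshold=0.95, direct_mappings=mappings))
```

Raising the threshold makes the suggestions more cautious, while a direct mapping still guarantees a suggestion for a proprietary field even when no model score is high enough.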

Another focus of the project that we hope to develop further is the idea of consistently improving results as the project matures. In the future we’re looking to develop a system where the model can continue to adapt based on how the suggested rules are used. Ideally, this will mean that if the model tends to incorrectly predict that a specific rule or rules will be useful for a given dataset column, it will begin to learn to avoid suggesting that rule for that column based on the fact that users tend to disagree with that suggestion. Similarly, if there are rules that the model tends to avoid suggesting for a certain column that users then manually select, the model will learn to suggest these rules in similar cases in the future.
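
One simple way to picture this kind of feedback loop is to count how often users accept each suggestion and to stop offering those that are usually rejected. The sketch below uses that counting heuristic as a stand-in; it is not a retrained model and not the planned system, and the class name, the default values and the example data are assumptions.

```python
from collections import defaultdict


class SuggestionFeedback:
    """Track how often users accept each (column category, rule) suggestion."""

    def __init__(self):
        self._accepted = defaultdict(int)
        self._shown = defaultdict(int)

    def record(self, category, rule, accepted):
        """Store one user decision about a suggested rule."""
        key = (category, rule)
        self._shown[key] += 1
        if accepted:
            self._accepted[key] += 1

    def acceptance_rate(self, category, rule):
        """Share of past suggestions accepted, or None if never shown."""
        key = (category, rule)
        if self._shown[key] == 0:
            return None
        return self._accepted[key] / self._shown[key]

    def should_suggest(self, category, rule, min_rate=0.5, min_shown=3):
        """Stop suggesting a rule once users have rejected it often enough."""
        rate = self.acceptance_rate(category, rule)
        if rate is None or self._shown[(category, rule)] < min_shown:
            return True  # not enough evidence yet, keep suggesting
        return rate >= min_rate


feedback = SuggestionFeedback()
for accepted in (False, False, True, False):
    feedback.record("phone", "valid_email", accepted)
print(feedback.should_suggest("phone", "valid_email"))
```

The same counts could equally be used to promote rules that users keep adding manually, which is the second half of the behaviour described above.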

In the same vein, one of the recent developments that I’ve found really interesting and exciting is a system that allows us to analyse the performance of different machine learning models on a suite of sample data. This gives us detailed insights into what makes an efficient and powerful rule prediction model, and how we can expect models to perform in real-world scenarios. It provides us with a sandbox to experiment with new ways of creating and updating machine learning models and to estimate baseline standards for performance, so we can be confident of the level of performance of our system. It’s been really rewarding to analyse the results from this process so far, to compare the different methods of processing the data and building machine learning models, and to see in which areas one model may outperform another.

Thanks to Daniel for talking to us about rules suggestion. If you would like to discuss further or find out more about rules suggestion at 糖心传媒, reach out to Daniel directly or to our Head of AI.

糖心传媒 is involved with the KTN: AI for Services UK Tour!
Tue, 23 Feb 2021

The post 糖心传媒 is involved with the KTN: AI for Services UK Tour! appeared first on 糖心传媒.

The first stop on the AI for Services UK Tour will be Northern Ireland, curated by the fantastic team at Invest Northern Ireland and Innovate UK!
AI for Services

We are delighted that 糖心传媒 will be one of the companies involved. The aim of the event is to discover the innovation taking place across the UK in the professional and financial services, insurance, accountancy and law sectors.

Kainos, Adoreboard and Analytics Engines are among the few other companies also representing Northern Ireland in the AI for Services Tour. 糖心传媒 Head of AI, Dr Fiona Browne, will be pitching at the event. We thought it would be a good idea to catch up with Dr Browne ahead of the event to find out what it’s all about!

Hi Fiona! Could you tell me more about the event and why 糖心传媒 is involved?

The AI for Services event is a UK-wide event hosted by KTN Innovate UK and we are part of the NI cohort. The event is a roadshow, which will provide the opportunity for companies from all the different regions to highlight what they are doing in terms of innovation and AI and how these can address areas within the various sectors. The roadshow will also allow each of the companies to pitch to organisations in different sectors including Accountancy, Insurance and Financial Services.

Fiona, you will be giving one of these pitches at the event. What can you tell us about it?

All the regions have a chance to provide a 7-minute pitch. We will be describing who 糖心传媒 are and what we specialise in (Data Quality and Matching). We will be focusing on a particular use case, which is related to Onboarding and the role of entity matching within this process, highlighting the recent work we have done in this area. We will be highlighting the data quality required before the matching process occurs, but also how we have augmented our matching process with machine learning.

If you could pick one key takeaway that you would want people to get from the pitch, what would it be?

I think the key message to take away is that Machine Learning (ML) has a role to play in addressing manual, time-consuming tasks and, when applied to the correct applications, it can deliver efficiency savings. However, good ML is built on quality data, and effort is needed to ensure that you have a reproducible data quality pipeline in place. At 糖心传媒 we pride ourselves on our data quality and matching technology and have innovated in these areas. We are really excited about the developments we are making, and we can’t wait to tell you more!

糖心传媒 will be representing NI. Do you think that the talent here locally and the technological developments are matching up to the rest of the UK?

Yes! There’s a real focus on Artificial Intelligence and FinTech within NI. The country may be small in size but in terms of capabilities it offers great solutions.

What do you hope will be the biggest takeaway for attendees from the event as a whole?

The idea of this event is for companies in sectors such as finance, insurance, law and accountancy, who are embarking on or already progressing through their digital transformation journey, to connect with companies that offer innovative solutions. At 糖心传媒 we want to better understand the bottlenecks and pain points that companies in these sectors are facing and offer a solution that addresses them. We hope to deepen our specialist knowledge of the current challenges in the industry so that we can tailor our technology to solve real business problems. We will showcase our self-service data quality and matching solutions, highlighting the continual developments we have made with machine learning to augment the matching process.

It is also a great opportunity to build our presence in these sectors, as we are primarily linked to the financial and government sectors. Accountancy, Law and Insurance are sectors that we haven’t traditionally marketed to, but they have similar areas to address, such as compliance with regulation and common data management challenges.

What would you like the audience to share?

We will highlight what our solution is and what we do, but we want to understand better the pain points. Where do the difficulties lie? Is it extracting knowledge from textual sources of information? Or is it issues with integrating different data sources? Or is it issues with adhering to regulations? It will be good to hear first-hand from these organisations.

Are you looking forward to hearing any particular pitch on the day?

I am looking forward to hearing them all. Because all the companies are very different, it’ll be particularly interesting to hear more about their solutions and the innovations that they are offering.

How can attendees get in touch with you?

You can register as a delegate to hear the presentations. Innovate UK is then using a platform called Meeting, where 1:1 meetings with companies can be booked between 12:30 and 2 pm.

The event is sure to be a good one, and we are excited to be involved. We are most excited to learn more about the different sectors! Keep an eye on the KTN social media pages for updates on the event. KTN also has an events archive where you can listen to past events if you have missed them.

KTN: AI for Services on Tour | 23/02 /events/ktn-ai-for-services-on-tour/ Tue, 16 Feb 2021 09:25:27 +0000 /?p=13951 We are delighted to be one of a few Northern Irish businesses to be part of the KTN: AI for Services on Tour. This is a brilliant opportunity for those in attendance to hear what 糖心传媒 is doing in the space. Kainos, Adoreboard, and Analytics Engines are amongst the few other companies also representing Northern […]

The post KTN: AI for Services on Tour | 23/02 appeared first on 糖心传媒.


We are delighted to be one of a few Northern Irish businesses to be part of the KTN: AI for Services on Tour.

This is a brilliant opportunity for those in attendance to hear what 糖心传媒 is doing in the space.

Kainos, Adoreboard, and Analytics Engines are amongst the few other companies also representing Northern Ireland. Dr Fiona Browne will be speaking at the roadshow, which will be happening on 23rd February.

For more details and registration, .

The Open University talk: Business Ethics | 17/02 /events/ou-business-ethics-17-02/ Tue, 16 Feb 2021 09:06:10 +0000 /?p=13949 Matt Flenley, Marketing and Partnerships Manager at 糖心传媒, will be speaking this week at The Open University, delivering a talk on Business Ethics. The talk is going to cover four things: You can also read Matt’s blogs here such as a piece on AI Ethics he has written about or find out about our people here, explore our open vacancies. If you’re […]

The post The Open University talk: Business Ethics | 17/02 appeared first on 糖心传媒.

Business Ethics

Matt Flenley, Marketing and Partnerships Manager at 糖心传媒, will be speaking this week at The Open University, delivering a talk on Business Ethics.

The talk is going to cover four things:

  1. The impact of unintended and cultural bias in machine learning
  2. What to do if your business loses or has no soul
  3. Corporate Social Responsibility – Looking after people when the world is upside down
  4. The benefits and pitfalls of big corporate machines and rapid growth start-ups when it comes to doing charitable work and being a force for good.

You can also read Matt’s blogs, such as a piece he has written on AI Ethics, find out about our people, or explore our open vacancies. If you’re curious about working at 糖心传媒, please get in touch for a chat.

The Open University Business Ethics talk & 糖心传媒 /blog/marketing-insights/ou-business-ethics-talk/ Mon, 15 Feb 2021 13:00:00 +0000 /?p=13941 Matt Flenley, Marketing and Partnerships Manager at 糖心传媒, will be speaking this week at The Open University, delivering a talk on Business Ethics. Prior to the talk, we thought it would be a good idea to have a chat and find out why this topic, what other views he hopes to talk about, and the importance of business ethics, […]

The post The Open University Business Ethics talk & 糖心传媒 appeared first on 糖心传媒.

Business Ethics

Matt Flenley, Marketing and Partnerships Manager at 糖心传媒, will be speaking this week at The Open University, delivering a talk on Business Ethics.

Prior to the talk, we thought it would be a good idea to have a chat and find out why he chose this topic, what other views he hopes to talk about, and the importance of business ethics, especially from a data perspective.

 Hi Matt, what can you tell us about the talk you are giving at The Open University?

I am really excited to give this talk as this is an area I am passionate about. The talk is going to cover four things:

  1. The impact of unintended and cultural bias in machine learning 
  2. What to do if your business loses or has no soul
  3. Corporate Social Responsibility – Looking after people when the world is upside down
  4. The benefits and pitfalls of big corporate machines and rapid growth start-ups when it comes to doing charitable work and being a force for good. 

How important do you think ethics is within the data industry? 

I think ethics are important. People very often think about algorithms and automated rules as being the critical part to measure, but before all of that, there’s data. You must involve data in the process to be able to understand whether the sample you are measuring is right. The quality of the information you use depends on whether the information is complete and whether you sought out the correct data to begin with.

Do you think that an understanding of ethics and data has increased in importance in recent years? 

I do, due to the increased understanding of the importance of AI. For example, there are images on the internet that some algorithms can learn from in order to generate people that don’t actually exist. The resulting images look like recognisable people to you or me, but these people don’t exist; it’s a clever piece of AI. A problem that has been increasingly recognised with the source material is that it doesn’t contain enough images of older women. This has meant that, as the algorithm generated people, the AI’s conclusion was that as they age, everyone becomes an old man! Because images of older women are absent, an inaccurate representation of society becomes prevalent. If you don’t have the right data going into an algorithm, you won’t have accurate data coming out of it. People are increasingly understanding the importance of data, and examples like this shine a light on bias and how damaging it can be to society.

How important is it to share this knowledge with the leaders of tomorrow at The Open University?

It is absolutely critical! I believe it’s vital for business people as well as technologists to be ethicists. The more ethicists there are in the discussion, the less bias you will end up with in the room, which will fundamentally lead to fairer outcomes.

How important is the partnership between The Open University and 糖心传媒? When did it begin?

The relationship has been longstanding. We have a number of staff members who are studying there alongside working, and indeed one working as a lecturer at the institution. One of the best parts of working with The Open University is the access to talent in unexpected places. There are a number of students pursuing careers in technology who have not gone about it in a conventional way, like immediately heading to a red-brick university for a computer science degree. Some of them are further down the line in different careers and have decided to make a career change, and some have decided to retrain while working. It’s a real mix and a really encouraging, affirming environment for people to pursue their education and career.

Thank you Matt! We will be sharing soundbites from this talk, so make sure to keep an eye out for those. 

You can also read Matt’s blogs, such as a piece he has written on AI Ethics, find out about our people, or explore our open vacancies. If you’re curious about working at 糖心传媒, please get in touch for a chat.

AI Con 2020 Interview with Dr. Fiona Browne and Matt Flenley /blog/marketing-insights/ai-con-2020-interview-with-dr-fiona-browne-and-matt-flenley/ Wed, 02 Dec 2020 12:00:36 +0000 /?p=13102 Dr. Fiona Browne, Head of AI, and Matt Flenley, Marketing and Partnerships Manager at 糖心传媒 are contributing to AI Con 2020 this year. After a successful first year, AI Con is back! This year it’s said to be bigger and better than ever with a range of talks across AI, including AI/ML in Fintech; AI in the public sector; the impact of arts; the impact of […]

The post AI Con 2020 Interview with Dr. Fiona Browne and Matt Flenley appeared first on 糖心传媒.

Dr. Fiona Browne, Head of AI, and Matt Flenley, Marketing and Partnerships Manager at 糖心传媒 are contributing to AI Con 2020 this year.   
AI CON

After a successful first year, AI Con is back!

This year it’s said to be bigger and better than ever with a range of talks across AI, including AI/ML in Fintech; AI in the public sector; the impact of AI on the arts; the impact of AI on research and innovation; and how AI has caused a change in the screen industries. All these topics will be tackled by world-leading technology professionals and business leaders to unpack how AI is changing our world.

Ahead of AI Con 2020 taking place virtually on the 3rd and 4th December, we thought it would be a good idea to sit down with two of those industry experts, Fiona and Matt, and ask them a few things. I wanted to understand what their involvement with AI Con is this year, any previous involvement they’ve had with it, what they envisage to be the key takeaways and, of course, which talks they are most looking forward to themselves.

Hi, Fiona and Matt. Perhaps to kick off, you could talk a bit about why you both wanted to be involved with AI Con?

Fiona: Hello! Well, we were involved with it last year and it was a great experience. We were involved in the session that focused on business and the applications of AI. We were asked then to pull a session together for this year, and we’ve been able to focus on the area that 糖心传媒 specialises in, which is Financial Services.

This has given us the chance to unpack how machine learning can be used in Financial Services. We’ve tried to cover three broad areas within this session: firstly, understanding the people who work in financial institutions; secondly, our bread and butter, data quality and matching; and lastly, the importance of data governance.

Matt: Hi! Last year I worked with Fiona to arrange our involvement. This year, we had more time to prepare, which meant that Fiona and I could collaborate even more closely.

I particularly enjoyed approaching speakers such as Peggy and Sarah (to name but a few!). What interests me most is the application of AI and we are delighted to have contributed towards pulling together such a strong line-up.

The variety of talks too will bring a wide range of attendees!  

This is the second year. Perhaps you both could talk to me about your previous involvement with AI Con, if any, and how it has evolved?  

Fiona: Last year we discovered there was a significant appetite for this content. By being more strategic with the messaging, we have been able to expand this year’s conference across more streams. We have also been able to create a session of our own, on a topic we know well and are passionate about. This year the conference is not just local; it’s much more international. Even the line-up of speakers for our session includes people from New York and Switzerland.

The international flavour offers a greater perspective, knowledge, and insight.

Matt: I agree. I’ve been blown away by how engaged people have been. We have Andrew Jenkins, the Fintech Envoy for Northern Ireland, and Gary Davidson of Tech Nation, who are keen to contribute their views on where they think the market is going.

The panel I am chairing is focusing on FinTechs that are scaling and exporting with a focus on why people should invest in NI technology. The event is well-prepared and timely, and I am looking forward to chairing on Thursday.  

So, Matt what will the panel you are chairing be discussing, who is on the panel?  

Matt: We are joined by Pauline Timoney, COO of Automated Intelligence; Chris Gregg, CEO and Founder of Light Year; and, as I mentioned before, Andrew Jenkins and Gary Davidson. We are going to look at the opportunities to collaborate with incubators like TechNation, the impact of COVID-19, Brexit, and FinTech investments over the last year.

FinTech is a hugely growing sector, and we are excited to delve into why and explore where the sector is going next!  

Fiona, you have been one of the curators of AI Con, how has that process been?  

Fiona: It has been great! We were given the remit of FinTech and we could pick and choose what topics and who we wanted to add to the line-up. We have a very clear message. The talks are practical application-centred with a focus on trends and experience.

One of the largest Wealth Management Companies in the world is coming to speak to discuss their usage of technology, future projections, and more!  

What do you both envisage the biggest takeaways of AI Con being?  

Matt: One of the biggest takeaways is going to be the incredible, thriving NI FinTech sector.

When you look around the ecosystem, you can see the sheer explosion of firms and the problems being solved.

Fiona: There will be maturity across the board, with more companies implementing these technologies.

People are increasingly thinking about Machine Learning and AI… how can we use it?

I believe there will be a skillset gap which will be a challenge; it will be a challenge for many firms to attract the talent that can implement these processes and technologies.  

To wrap up! On a personal note, what talk(s) are you both most looking forward to?

Matt: I am excited to hear from Sarah Gadd of Credit Suisse. Her wealth of experience will offer great insight into how they apply AI in practice. Not only are they on the cutting edge of technology, but they have got it off the ground. I am also looking forward to Peggy Tsai’s contribution.

Fiona: From our side, Sarah and Peggy will be interesting. It’s an honour to have a speaker like Sarah Gadd. It’s brilliant to hear how they are applying this technology now in a regulated area. What are their challenges and solutions? Also, Peggy is giving time to the complexity of data, which is more important than ever before. Austin too will be unpacking AI in the arts and music sector. I am looking forward to the overall variety, calibre, and diversity of points of view that will be offered.

Thank you both for taking the time out of your schedules! If you haven’t got your place for AI Con 2020 reserved, there is no time like the present! You can secure your place for free. It will be a brilliant conference. Who’s ready to learn more about AI?

AI Con 2020 | 3rd – 4th December /events/ai-con-2020/ Mon, 09 Nov 2020 17:05:55 +0000 /?p=12996 We are delighted that our very own Fiona Browne has helped to co-curate AI CON 2020. The second annual AI CON is taking place virtually on Thursday 3rd and Friday 4th December and will be hosted by Kainos and Aisling Events as per last year. This year’s event will be recorded live from the AI CON studio […]

The post AI Con 2020 | 3rd – 4th December appeared first on 糖心传媒.


We are delighted that our very own Fiona Browne has helped to co-curate AI CON 2020.

The second annual AI CON is taking place virtually on Thursday 3rd and Friday 4th December and will be hosted by Kainos and Aisling Events, as per last year. This year’s event will be recorded live from the AI CON studio in Belfast over two days. The event will bring together world-leading technology professionals and business leaders to discuss and examine how AI is continuing to change our world. This year’s gathering will discuss:

  • AI/ML in Fintech
  • AI in the Public Sector
  • Impact of AI on Society, Arts, and Culture
  • Applied AI/Supporting AI Startups
  • AI Research and Innovation
  • AI in the Screen Industries

AI CON will once again be a brilliant opportunity to listen to and engage with professionals, ranging from developers to business leaders, who have led on adopting AI as a tool to build better services, products, or business operations.

The conference is free to attend but delegates must register ahead of time.

How can banks arm themselves against increasing regulatory and technological complexity? – FinTech Finance /blog/ai-ml/2020-the-year-of-aml-crisis/ Tue, 03 Nov 2020 10:00:22 +0000 /?p=12885 糖心传媒 Head of Artificial Intelligence, Dr. Fiona Browne, recently contributed to an episode of FinTech Finance: Virtual Arena. Steered by Douglas MacKenzie, the interview covered the extent of the Anti-Money Laundering (AML) fines faced by banks over recent years and started to unpack what we do at 糖心传媒 in relation to […]

The post How can banks arm themselves against increasing regulatory and technological complexity? – FinTech Finance appeared first on 糖心传媒.


糖心传媒 Head of Artificial Intelligence, Dr. Fiona Browne, recently contributed to an episode of FinTech Finance: Virtual Arena. Steered by Douglas MacKenzie, the interview covered the extent of the Anti-Money Laundering (AML) fines faced by banks over recent years and started to unpack what we do at 糖心传媒 in relation to this topic: helping banks address their data quality, with essential solutions designed to combat fraudsters and money launderers.

How can banks arm themselves against increasing regulatory and technological complexity?

Fiona began by highlighting how Financial Institutions face significant challenges when managing their data. With the increase in financial regulation since the financial crisis of 2008/2009, ensuring data quality has gained in importance, obliging institutions to have a handle on their data and make sure it is up to date. Modern data quality platforms mean that the timeliness of data can now be checked via a ‘pulse check’ to ensure that it can be used in downstream processes and that it meets regulations.

Where does 糖心传媒 fit into the AML arena?

A financial institution needs to be able to verify the client that they are working with when going through the AML checks. The AML process itself is vast but at 糖心传媒, we focus on the area of profiling data quality and matching: it is our bread and butter. Fiona stressed the importance of internal checks as well as public entity data, such as sanction and watch lists.

In a nutshell, there is a significant amount of data to check and compare, and without quality data this becomes a difficult and costly task to perform. That is why we at 糖心传媒 focus on data quality cleansing and matching at scale.

Why should banks look to partner, rather than building in-house?

One of the key issues with doing this in-house is not having the necessary resources to perform the required checks and adhere to the different processes in the AML pipeline. According to the Financial Conduct Authority (FCA), in-house checks and a lack of data are causing leading financial institutions to receive hefty fines. Fiona reiterated that when banks go back to the fundamentals and get their processes right and their data in order, they can then use a partner’s technology to automate and streamline these processes, which in turn speeds up onboarding and ensures the legislation is being met.

Why did the period of 2018/2019 have such a high number of AML breaches?

Fiona explained that many transactions go back over a decade, and it takes time to identify them. AML compliance is difficult to achieve, and regulators know that it is challenging. The regulators are doing a better job of providing guidelines to financial institutions, enabling them to address these regulations. Fiona suggested that 2018/2019 was perhaps a much-needed wake-up call to address this issue.

And with AML fines already at $5.6 billion this year, more than the whole of 2019, what can banks do? 

Looking at the US, although the fines for non-compliant AML processes are not as high as in 2019, a substantial number of fines are still being issued. Fiona said that it is paramount for financial institutions to have the right data and the right processes in place. Although AML can be seen as an administrative burden, there is real criminal activity behind the scenes, which is why it is so important. It is vital that financial institutions get a handle on this, which also enables them to improve the experience for their clients.

The fines will continue to be issued. Why should firms look to clean data when they just want to get to the bottom line? 

It is essential to have the building blocks in place. Data quality is key for the onboarding process, but it is also essential downstream, particularly if you want to do more trend analysis. Getting the fundamentals right at the start will pay dividends.

Are there any other influences that Artificial Intelligence (AI) and Machine Learning (ML) can have on the banks’ onboarding process?

According to Fiona, there is no silver bullet: no single AI/ML technique will solve all the AML issues. It is about deploying different techniques to approach the issues in different ways. A large part of the onboarding process is gathering data and extracting relevant information from it. Fiona has seen a lot of Natural Language Processing (NLP) techniques employed to extract data from documents. At 糖心传媒, we use Machine Learning in the data matching process to reduce manual review time. Both supervised and unsupervised ML approaches are employed to pinpoint fraudulent transactions. We think that the graph database and network analysis side of machine learning is an interesting area, and we are currently exploring how it can be deployed in AML and fraud detection.
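
To illustrate how a match score can cut down manual review, here is a small sketch that routes candidate record pairs into automatic matches, automatic non-matches, and a middle band left for a human. It is not the 糖心传媒 matching engine: a plain string similarity from Python's standard library stands in for a trained model's score, and the thresholds and company names are made up for the example.

```python
from difflib import SequenceMatcher


def similarity(left, right):
    """Stand-in for a model score: string similarity after light normalisation."""
    return SequenceMatcher(None, left.lower().strip(), right.lower().strip()).ratio()


def triage(score, auto_match=0.92, auto_reject=0.60):
    """Route a pair by score; both thresholds are hypothetical and need tuning."""
    if score >= auto_match:
        return "match"
    if score < auto_reject:
        return "non-match"
    return "manual review"


# Made-up candidate pairs, e.g. a client record against a watch-list entry.
pairs = [
    ("Acme Holdings Ltd", "ACME Holdings Ltd"),
    ("Acme Holdings Ltd", "Zenith Bank plc"),
]

for left, right in pairs:
    score = similarity(left, right)
    print(f"{left!r} vs {right!r}: score={score:.2f} -> {triage(score)}")
```

Only pairs falling between the two thresholds reach a reviewer, so widening or narrowing that band trades review effort against the risk of a wrong automatic decision.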

Bonus content: In the US and Canada, one way to potentially identify fraud was to look at transactions over $10,000. Criminals, however, have become increasingly savvy and use Machine Learning to cover their tracks, dividing transactions into randomised amounts to make them appear less conspicuous. As Fiona put it, it is ‘the cat and mouse game’.
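
One simple counter-measure to split transactions is to look at totals over a time window rather than at individual amounts. The sketch below flags accounts whose individually sub-threshold payments add up to the threshold within a few days. The $10,000 figure comes from the interview; the three-day window, the account names and the amounts are assumptions for illustration, and real monitoring systems are considerably more sophisticated.

```python
from collections import defaultdict
from datetime import datetime, timedelta

THRESHOLD = 10_000          # single-transaction threshold mentioned in the interview
WINDOW = timedelta(days=3)  # hypothetical look-back window


def flag_structuring(transactions, threshold=THRESHOLD, window=WINDOW):
    """Return accounts whose sub-threshold payments reach the threshold within the window."""
    by_account = defaultdict(list)
    for account, timestamp, amount in transactions:
        if amount < threshold:  # only payments that look unremarkable on their own
            by_account[account].append((timestamp, amount))

    flagged = set()
    for account, items in by_account.items():
        items.sort()
        start = 0
        running = 0.0
        for timestamp, amount in items:
            running += amount
            # Drop payments that have fallen out of the look-back window.
            while timestamp - items[start][0] > window:
                running -= items[start][1]
                start += 1
            if running >= threshold:
                flagged.add(account)
                break
    return flagged


# Made-up transactions: (account, timestamp, amount).
transactions = [
    ("A", datetime(2020, 1, 1), 4200.0),
    ("A", datetime(2020, 1, 2), 3900.0),
    ("A", datetime(2020, 1, 3), 2750.0),
    ("B", datetime(2020, 1, 1), 4000.0),
    ("B", datetime(2020, 1, 10), 4000.0),
    ("B", datetime(2020, 1, 20), 4000.0),
    ("C", datetime(2020, 1, 5), 12000.0),
]

print(flag_structuring(transactions))
```

Account A is flagged because three small payments in three days total $10,850. Account B's payments are too far apart, and account C's single large payment would already be caught by the ordinary threshold rule.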

If you are employed in the banking sector or if you must deal with large and messy datasets, you will probably face challenges derived from poor data quality, standardization, and siloed information. 

糖心传媒 provides the tools to tackle these issues with minimum IT overhead, in a powerful and agile way. Get in touch with the self-service data quality experts today to find out how we can help.

糖心传媒 contributes to Bank of England and FCA’s AI Public-Private Forum /press-releases/datactics-contributes-to-bank-of-england-and-fcas-ai-public-private-forum/ Mon, 12 Oct 2020 07:27:00 +0000 /?p=12644 Belfast, London, New York, 12th October 2020 糖心传媒 is pleased to announce that its Head of AI, Dr Fiona Browne, has been invited to participate in the Artificial Intelligence Public-Private Forum, joining 20 other experts from across the financial technology sectors as well as academia, along with the observers from the Information Commissioner’s Office and […]

The post 糖心传媒 contributes to Bank of England and FCA’s AI Public-Private Forum appeared first on 糖心传媒.

Belfast, London, New York, 12th October 2020
AI Public-Private Forum

糖心传媒 is pleased to announce that its Head of AI, Dr Fiona Browne, has been invited to participate in the Artificial Intelligence Public-Private Forum, joining 20 other experts from across the financial technology sectors as well as academia, along with observers from the Information Commissioner’s Office and the Centre for Data Ethics and Innovation.

The purpose of the Forum, launched by the Bank of England and the Financial Conduct Authority, is to facilitate dialogue between the public and private sectors to better understand the use and impact of AI in financial services, which will help further the Bank’s objective of promoting the safe adoption of this technology.

The AI Public-Private Forum, with an intended duration of one year, will consist of a series of quarterly meetings and workshops structured around three topics: data, model risk management, and governance.

Commenting on the initiative’s launch, the deputy governor for markets and banking at the BofE, David Ramsden, said:

The existing regulatory landscape is somewhat fragmented when it comes to AI, with different pieces of regulation applying to different aspects of the AI pipeline, from data through model risk to governance. The policy must strike a balance between high-level principles and a more rules-based approach. We also need to future-proof our policy initiatives in a fast-changing field.

The specific aims of the Forum are: firstly, to share information and understand the practical challenges of using AI in financial services, identify existing or potential barriers to deployment, and consider any potential risks or trade-offs; secondly, to gather views on areas where principles, guidance, or regulation could support safe adoption of these technologies; and finally, to consider whether once the forum has completed its work ongoing industry input could be useful and if so, what form this could take.

The knowledge, experience, and expertise of the Forum’s members and observers will be invaluable in helping us to contextualise and frame the Bank’s thinking on AI, its benefits, its risks and challenges, and any possible future policy initiatives.

Fiona Browne, Head of AI at 糖心传媒, said:

I’m really excited and honoured to be part of such a timely forum. AI/ML services touch our everyday lives from recommending what we watch to groceries that we buy.

Within financial services, ML can offer efficiency benefits reducing manual time-consuming tasks, to saving customers money in suggesting best financial products to bespoke customer service solutions and fraud detection. These solutions need to sit within a legal and regulatory environment in the financial sector and are not without their risks and challenges.

I hope to offer the forum insights and experience of the practical implementation of ML, based on the areas of data quality and fairness, through to transparency and explainability in the process and model predictions, through to the monitoring of models in production. I am excited to focus on and tease out potential guidance and best practice on how to safely adopt and deploy such solutions.

What is the AI Public-Private Forum?

The Bank of England, working with the FCA, has established the AIPPF (AI Public-Private Forum). The forum launched in October 2020 and consists of members who applied to join, reflecting a variety of views and bringing with them their expertise in AI/ML. The AIPPF will:

  • Share information and understand the practical challenges of using AI/ML within financial services, as well as the barriers to deployment and potential risks.
  • Gather views on potential areas where principles, guidance or good practice examples could be useful in supporting safe adoption of these technologies.
  • Consider whether ongoing industry input could be useful and what form this could take (e.g. considering an FMSB-type structure or industry codes of conduct).

More information about the Forum can be found .

The post 糖心传媒 contributes to Bank of England and FCA’s AI Public-Private Forum appeared first on 糖心传媒.

]]>
IRMAC Reflections with Dr. Fiona Browne /blog/ai-ml/irmac-reflections-with-dr-fiona-browne/ Mon, 07 Sep 2020 09:00:00 +0000 /?p=11379 There is a lot of anticipation surrounding Artificial Intelligence (AI) and Machine Learning (ML) in the media. Alongside the anticipation is speculation – including many articles placing fear into people by implying that AI and ML will replace our jobs and automate our entire lives! Dr Fiona Browne, Head of AI at 糖心传媒 recently spoke at an IRMAC (Information […]

The post IRMAC Reflections with Dr. Fiona Browne appeared first on 糖心传媒.

]]>
There is a lot of anticipation surrounding Artificial Intelligence (AI) and Machine Learning (ML) in the media. Alongside the anticipation is speculation – including many articles placing fear into people by implying that AI and ML will replace our jobs and automate our entire lives!

Dr Fiona Browne, Head of AI at 糖心传媒 recently spoke at an IRMAC (Information Resource Management Association of Canada) webinar, alongside Roger Vandomme, of Neos, to unpack what AI/ML is, some of the preconceptions, and the reasons why different approaches to ML are taken…  

IRMAC reflections with Dr Browne

What is AI/ML?

Dr. Browne clarified that whilst there is no official agreed-upon definition of AI, it can be depicted as the ability of a computer to perform cognitive tasks, such as voice/speech recognition, decision making, or visual perception. ML is a subset of AI, entailing different algorithms that learn from input data.  

A point that Roger brought up at IRMAC was that the algorithms learn to identify patterns within the data, and those patterns make it possible to distinguish between different outcomes, for example, whether a transaction is fraudulent or non-fraudulent.

ML takes processes that are repetitive and automates them. At 糖心传媒, we are exploring the usage of AI and ML in our platform capabilities – Dr Fiona Browne

What are the different approaches to ML?  

Dr. Browne communicated that, at a broad level, there are three approaches: supervised, unsupervised, and reinforcement machine learning.

In supervised ML, the model learns from a labelled training data set. For example, financial transactions would be labelled as either fraudulent or genuine and then fed into the ML model. The model learns from this input and can then distinguish between the two.
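As a rough illustration of learning from labelled data, the sketch below trains a very simple nearest-centroid classifier on a handful of made-up transactions. The feature choice (amount and recent transaction count), the values, and the function names are all assumptions for illustration; this is not the model discussed in the webinar.

```python
from statistics import mean

# Hypothetical labelled transactions:
# ((amount in GBP, transactions in the last hour), label)
training_data = [
    ((12.50, 1), "genuine"),
    ((40.00, 2), "genuine"),
    ((25.75, 1), "genuine"),
    ((980.00, 9), "fraudulent"),
    ((1500.00, 12), "fraudulent"),
    ((1120.00, 8), "fraudulent"),
]


def fit_centroids(rows):
    """Learn one average feature vector (centroid) per label."""
    by_label = {}
    for features, label in rows:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*vectors))
        for label, vectors in by_label.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest to the new transaction."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda label: squared_distance(centroids[label], features))


centroids = fit_centroids(training_data)
print(predict(centroids, (30.00, 1)))     # a small, infrequent payment
print(predict(centroids, (1300.00, 10)))  # a large payment in a burst of activity
```

The labels play the role of the "answer key": the model only ever sees what it has been told is fraudulent or genuine, which is why the quality of those labels matters so much.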

Where data is unlabelled, Dr. Browne explained that unsupervised ML would be more appropriate, where the model learns from unlabelled data. There is a key difference here with supervised ML in that the model would seek to uncover clusters or patterns inherent in the data to enable it to separate them out.  
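To make the contrast with supervised learning concrete, here is a minimal sketch of k-means clustering on unlabelled transaction amounts. The data and the deterministic starting centres are assumptions chosen to keep the example small; no label is supplied anywhere.

```python
def k_means(points, k, iterations=10):
    """Group 1-D values into k clusters without using any labels."""
    if k < 2:
        raise ValueError("this sketch expects at least two clusters")
    ordered = sorted(points)
    # Deterministic start: spread the initial centres across the sorted values.
    centres = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    clusters = [[] for _ in centres]
    for _ in range(iterations):
        clusters = [[] for _ in centres]
        for value in points:
            nearest = min(range(k), key=lambda i: abs(value - centres[i]))
            clusters[nearest].append(value)
        # Move each centre to the mean of its cluster (keep it if the cluster is empty).
        centres = [sum(c) / len(c) if c else centres[i] for i, c in enumerate(clusters)]
    return centres, clusters


# Hypothetical unlabelled transaction amounts.
amounts = [12.5, 40.0, 25.75, 980.0, 1500.0, 1120.0, 31.2]
centres, clusters = k_means(amounts, k=2)
print(centres)
print(clusters)
```

The algorithm separates small from large amounts on its own; deciding whether the "large" cluster means fraud is still a human judgement.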

Finally, reinforcement machine learning involves models that continually learn and update from performing a task, for example, a computer algorithm learning how to play the game ‘Go’. This is achieved by the outputs of the model being validated and that validation being provided back to the model.

The difference between supervised learning and reinforcement learning is that in supervised learning the training data has the answer key with it, meaning the model is trained with the correct answer.

In contrast to this, in reinforcement learning, there is no answer, but the reinforcement agent selects what to do to perform the specific task.
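The idea of an agent choosing actions and learning from feedback, rather than from an answer key, can be sketched with an epsilon-greedy "multi-armed bandit". This is one of the simplest reinforcement learning settings and is used here only as an illustration; the reward probabilities, step count and function name are assumptions.

```python
import random


def run_bandit(reward_probabilities, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: mostly exploit the best-known action, sometimes explore."""
    rng = random.Random(seed)
    n_actions = len(reward_probabilities)
    estimates = [0.0] * n_actions  # the agent's current estimate of each action's value
    pulls = [0] * n_actions        # how often each action has been tried
    for _ in range(steps):
        if rng.random() < epsilon:
            action = int(rng.random() * n_actions)  # explore: pick any action
        else:
            action = max(range(n_actions), key=lambda a: estimates[a])  # exploit
        # The environment, not a labelled dataset, provides the feedback.
        reward = 1.0 if rng.random() < reward_probabilities[action] else 0.0
        pulls[action] += 1
        # Incremental average: nudge the estimate towards the observed reward.
        estimates[action] += (reward - estimates[action]) / pulls[action]
    return estimates, pulls


# Hypothetical environment: action 1 pays off far more often than action 0.
estimates, pulls = run_bandit([0.2, 0.8])
print(estimates)
print(pulls)
```

Nobody tells the agent which action is correct; it discovers that the second action is better purely from the rewards it receives.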

It is important to remember that, with no training dataset present, the agent is bound to learn from its own experience.  Often the biggest trial comes when a model is being transferred out of the training environment and into the real world.

Now that AI/ML and the different approaches have been unpacked… the next question is how does explainability fit into this? The next mini IRMAC reflection will unravel what explainability is and what the different approaches are. Stay tuned!

Fiona has written an extensive piece on AI enabled data quality, feel free to check it out here.


The post IRMAC Reflections with Dr. Fiona Browne appeared first on 糖心传媒.

]]>
IRMAC Detective Data Work: AML and Emergent AI practices | 12/07/20 /events/irmac-webinar-aml-ai/ Wed, 01 Jul 2020 09:00:00 +0000 /?p=11745 Earlier this month, our Head of AI, Dr. Fiona Browne took part in the IRMAC webinar ‘Detective Data Work’ and explored the AML and emergent AI practices. Missed it? Watch the recording below: In this webinar,  the expert panellists questioned what anti-money laundering (AML) efforts look like, and the complexities in sifting through vast data […]

The post IRMAC Detective Data Work: AML and Emergent AI practices | 12/07/20 appeared first on 糖心传媒.

]]>
Earlier this month, our Head of AI, Dr. Fiona Browne took part in the IRMAC webinar ‘Detective Data Work’ and explored the AML and emergent AI practices.

Missed it? Watch the recording below:

In this webinar, the expert panellists questioned what anti-money laundering (AML) efforts look like, and the complexities in sifting through vast data volumes, data quality and identification in an effort to make their findings ‘explainable’.

Reducing the money flow in criminal activities had a major boost after the events of 9/11/2001.

Now Artificial Intelligence (AI) and Machine Learning (ML) techniques are beginning to revolutionize practices in this field. – IRMAC

IRMAC

糖心传媒 Fiona:

Fiona Browne is Head of Artificial Intelligence at 糖心传媒 with over 15 years’ research and industrial experience. Prior to joining 糖心传媒, Fiona lectured in Computing Science at Ulster University teaching Data Analytics and undertaking research on applied artificial intelligence and data integration. She was a Research Fellow at Queen’s University Belfast and a Senior Software Developer at PathXL. Fiona received a BSc (Hons.) degree in Computing Science and a PhD on Artificial Intelligence in Bioinformatics from Ulster University.

糖心传媒 IRMAC:

IRMAC (the Information Resource Management Association of Canada) is a non-profit, vendor-independent association of information management and business professionals.

Our primary objective is to provide a forum for members to exchange information, experiences and promote the understanding, development and practice of managing information and data as a key enterprise asset.

The post IRMAC Detective Data Work: AML and Emergent AI practices | 12/07/20 appeared first on 糖心传媒.

]]>
Read how AI is transforming Data Quality in this exclusive white paper /blog/ai-ml/ai-whitepaper-data-quality/ Wed, 10 Jun 2020 20:00:43 +0000 /ai-enabled-dq/ In this AI whitepaper, authored by our Head of AI Fiona Browne, we provide an overview of Artificial Intelligence (AI) and Machine Learning (ML) and their application to Data Quality. We highlight how tools in the 糖心传媒 platform can be used for key data preparation tasks including cleansing, feature engineering and dataset labelling for […]

The post Read how AI is transforming Data Quality in this exclusive white paper appeared first on 糖心传媒.

]]>

In this AI whitepaper, authored by our Head of AI Fiona Browne, we provide an overview of Artificial Intelligence (AI) and Machine Learning (ML) and their application to Data Quality.

We highlight how tools in the 糖心传媒 platform can be used for key data preparation tasks including cleansing, feature engineering and dataset labelling for input into ML models.

A real-world application of how ML can be used as an aid to improve consistency around manual processes is presented through an Entity Resolution Use Case.

In this case study we show how using ML reduced manual intervention tasks by 45% and improved data consistency within the process.

Having good quality, reliable and complete data provides businesses with a strong foundation to undertake tasks such as decision making and knowledge to strengthen their competitive position. It is estimated that poor data quality can cost an institution on average $15 million annually.

As we continue to move into the era of real-time analytics and Artificial Intelligence (AI) and Machine Learning (ML) the role of quality data will continue to grow. For companies to remain competitive, they must have in place flexible data management practices underpinned by quality data.

AI/ML are being used for predictive tasks from fraud detection through to medical analytics. These techniques can also be used to improve data quality when applied to dimensions such as accuracy, consistency, and completeness, along with the data management process itself.

In this whitepaper we will provide an overview of the AI/ML process and how 糖心传媒 tools can be applied in cleansing, deduplication, feature engineering and dataset labelling for input into ML models. We highlight a practical application of ML through an Entity Resolution Use Case which addresses inconsistencies around manual tasks in this process.
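To give a flavour of what cleansing and deduplication involve, here is a minimal sketch using only Python's standard library. It is not the 糖心传媒 platform or the method from the whitepaper; the company names, the normalisation rules and the 0.85 threshold are assumptions for illustration.

```python
from difflib import SequenceMatcher


def normalise(name):
    """Basic cleansing: lower-case, replace punctuation with spaces, collapse whitespace."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(kept.split())


def similarity(a, b):
    """Similarity between two cleansed names, from 0.0 (different) to 1.0 (identical)."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def find_duplicates(names, threshold=0.85):
    """Return pairs of records that look like the same entity (hypothetical threshold)."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            score = similarity(names[i], names[j])
            if score >= threshold:
                pairs.append((names[i], names[j], round(score, 2)))
    return pairs


# Hypothetical entity names with inconsistent formatting and a typo.
records = [
    "Acme Holdings Ltd.",
    "ACME  Holdings Ltd",
    "Acme Holdngs Ltd",
    "Borealis Trading plc",
]
for pair in find_duplicates(records):
    print(pair)
```

Pairs that score near the threshold are the ones a person would normally review by hand, which is where inconsistency between reviewers tends to creep in.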

The post Read how AI is transforming Data Quality in this exclusive white paper appeared first on 糖心传媒.

]]>
Explainable AI with Dr. Fiona Browne /blog/ai-ml/blog-ai-explainability/ Tue, 26 May 2020 18:19:57 +0000 /blog-ai-explainable/ The AI team at 糖心传媒 is building explainability from the ground up and demonstrating the “why and how” behind predictive models for client projects. Matt Flenley prepared to open his brains to a rapid education session from Dr Fiona Browne and Kaixi Yang. One of the most hotly debated tech topics of 2020 concerns model […]

The post Explainable AI with Dr. Fiona Browne appeared first on 糖心传媒.

]]>
Dr Fiona Browne, 糖心传媒, discusses Explainable AI

The AI team at 糖心传媒 is building explainability from the ground up and demonstrating the “why and how” behind predictive models for client projects.

Matt Flenley prepared to open his brains to a rapid education session from Dr Fiona Browne and Kaixi Yang.

One of the most hotly debated tech topics of 2020 concerns model interpretability, that is to say, the rationale of how an ML algorithm has made a decision or prediction. Nobody doubts that AI can deliver astonishing advances in capability and corresponding efficiencies in effort, but as HSBC’s Chief Data Officer Lorraine Waters shared at a recent A-Team event, “is it creepy to do this?” Numerous agendas at conferences are filled with differing rationales for interpretability and explainability of models, whether business-driven, consumer-driven, or regulatory frameworks to enforce good behaviour, but these are typically ethical conversations first rather than technological ones. It’s clear we need to ensure technology is “in the room” on all of these drivers.

We need to be informed and guided by technology to see what tools are already available to help with understanding AI decision-making, how tech can help shed light on ‘black boxes’ just as much as we’re dreaming up possibilities for the use of those black boxes.

As Head of 糖心传媒's AI team, Dr Fiona Browne has a strong desire for what she calls ‘baked-in explainability’. Her colleague Kaixi Yang explains more about explainable models:

Some algorithms, such as neural networks (deep learning), are complex. Functions are calculated through approximation, and from the network’s structure it is unclear how this approximation is determined. We need to understand the rationale behind the model’s prediction so that we can decide when or even whether to trust the model’s prediction, turning black boxes into glass boxes within data science.

The team puts their ‘explain first’ approach to a specific client project to build explainable Artificial Intelligence (XAI) from the ground up, using explainability metrics including LIME – a local, interpretable, model-agnostic way of explaining individual predictions.

“Model-agnostic explanations are important because they can be applied to a wide range of ML classifiers, such as neural networks, random forests, or support vector machines,” continued Ms Yang, who has recently joined 糖心传媒 after completing an MSc in Data Analytics with Queen’s University in Belfast. “They help to explain the predictions of any machine learning classifier and evaluate its usefulness in various tasks related to trust.”

For the work the team has been conducting, this range of explainability measures gives them the ability to choose the most appropriate Machine Learning model and AI systems, not just the one that makes the most accurate predictions based on evaluation scores. This has had a significant impact on their work on Entity Resolution for Know Your Customer (KYC) processes, a classic problem of large, messy datasets that are hard to match, with painful penalties for human users if it goes wrong. The project, which is detailed in a recent webinar hosted with the Enterprise Data Management Council, matched entities from the Refinitiv PermID and Global LEI Foundation’s datasets and relied on human validation of rule-based matches to train a machine learning algorithm.

Dr Browne again: “We applied different explainability metrics to three different classifiers that could predict whether a legal entity would match or not. We trained, validated and tested the models using an entity resolution dataset. For this analysis we selected two ‘black-box’ classifiers, and one interpretable classifier to illustrate how the explainability metrics were entirely agnostic and applicable regardless of the classifier that was chosen.”

The results are shown here:

explainability metrics in AI and ML

“In a regular ML conversation, these results indicate two reliably accurate models that could be deployed in production,” continued Dr Browne, “but in an XAI world we want to shed light on how appropriate those models are.”

By applying, for example, LIME to a random instance in the dataset, the team can uncover the rationale behind the predictions made. 糖心传媒's FlowDesigner rules studio automatically labelled this record as “not a match” through its configurable fuzzy matching engines.

Dr Browne continued, “Explainability methods build an interpretable classifier based on similar instances to the selected instance from the different classifiers and summarise the features which are driving this prediction. It selects those instances that are quite close to the predicted instance, depending on the model that’s been built, and uses those predictions from the black-box model to build a glass-box model, where you can then describe what’s happening.”
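The mechanism described above (sample instances near the one being explained, ask the black-box model for its predictions, then fit a simple interpretable model weighted towards the closest samples) can be sketched in a few lines. This is a simplified illustration in the spirit of LIME, not the `lime` library or the team's implementation; the stand-in `black_box` function, the sampling scale and the kernel width are all assumptions.

```python
import math
import random


def black_box(x):
    """Stand-in for an opaque classifier returning a probability of 'match'.
    Hypothetical: it depends strongly on feature 0 and barely on feature 1."""
    return 1.0 / (1.0 + math.exp(-(6.0 * x[0] + 0.2 * x[1] - 3.0)))


def solve(a, b):
    """Solve the linear system a.w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for r in reversed(range(n)):
        w[r] = (m[r][n] - sum(m[r][c] * w[c] for c in range(r + 1, n))) / m[r][r]
    return w


def explain_locally(predict, instance, n_samples=500, scale=0.3, kernel_width=0.5, seed=0):
    """Fit a proximity-weighted linear surrogate around one instance."""
    rng = random.Random(seed)
    rows, targets, weights = [], [], []
    for _ in range(n_samples):
        # Perturb the instance to get a nearby sample and query the black box.
        sample = [v + rng.gauss(0.0, scale) for v in instance]
        distance = math.dist(sample, instance)
        rows.append([1.0] + [s - v for s, v in zip(sample, instance)])  # intercept + offsets
        targets.append(predict(sample))
        # Closer samples count for more in the surrogate fit.
        weights.append(math.exp(-(distance ** 2) / kernel_width ** 2))
    k = len(instance) + 1
    # Weighted least squares via the normal equations: (X^T W X) w = X^T W y
    xtwx = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights)) for j in range(k)]
            for i in range(k)]
    xtwy = [sum(w * r[i] * y for r, w, y in zip(rows, weights, targets)) for i in range(k)]
    coefficients = solve(xtwx, xtwy)
    return coefficients[0], coefficients[1:]


intercept, feature_weights = explain_locally(black_box, [0.5, 0.5])
print(intercept, feature_weights)
```

The surrogate's coefficients are the "explanation": a large weight on one feature and a near-zero weight on another indicates which feature is driving this particular prediction. The real LIME method adds further steps, such as feature selection, that are left out here.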

prediction probabilities in AI

In this case, for the Random Forest model (fig.), the label has been correctly predicted as 0 (not a match) and LIME exposes the features driving this decision. The prediction is supported by two key features, but not by a feature based on entity name, which we know is important.

Using LIME on the multilayer perceptron model (fig.), which had the same accuracy as Random Forest, it correctly predicted the “0” label of “not a match” but with a lower support score. It has been supported by slightly different features compared to the random forest model.

prediction probabilities in AI

The Naïve Bayesian model was different altogether. “It fully predicted the correct label of zero with a prediction confidence of one, the highest confidence possible,” said Dr Browne, “however it’s made this prediction supported by only one feature, a match on the entity country, disregarding all other features. This would lead you to doubt whether it’s reliable as a prediction model.”

This has significant implications in something as riddled with differences in data fields as KYC data. People and businesses move, directors and beneficial owners resign, and new ones are appointed, and that’s without considering ‘bad actors’ who are trying to hoodwink Anti-Money Laundering (AML) systems.

explainability and interpretability in AI

The process of ‘phoenixing’, where a new entity rises from the ashes of a failed one, intentionally dodging the liabilities of the previous incarnation, frequently relies on truncations or mis-spellings of directors’ names to avoid linking the new entity with the previous one.

Any ML model being used on such a dataset would need to have this explainability baked-in to understand the reliability of predictions that the data is informing.

Using one explainability metric only is not good practice. Dr Browne explains 糖心传媒's approach: “Just as in classifiers, there’s no real best evaluation approach or explainer to pick; the best way is to choose a number of different models and metrics to try to describe what’s happening. There are always pros and cons, ranging from the scope of the explainer to stability of the code to complexity of the model and how and when it’s configured.”

These technological disciplines, to test, evaluate and try to understand a problem, are a crucial part of the entire conversation that businesses are having at an ethical or “risk appetite” level.


The post Explainable AI with Dr. Fiona Browne appeared first on 糖心传媒.

]]>
AI Bias – The Future is Accidentally Biased? /blog/marketing-insights/ai-bias-the-future-is-accidentally-biased/ Fri, 15 May 2020 10:21:51 +0000 /ai-bias-the-future-is-accidentally-biased/ AI Bias Every now and then a run-of-the-mill activity makes you sit up and take notice of something bigger than the task you’re working on, a sort of out-of-body experience where you see the macro instead of the micro. Yesterday was one such day. I’d had a pretty normal one of keeping across all the […]

The post AI Bias – The Future is Accidentally Biased? appeared first on 糖心传媒.

]]>
AI Bias

Every now and then a run-of-the-mill activity makes you sit up and take notice of something bigger than the task you’re working on, a sort of out-of-body experience where you see the macro instead of the micro.

AI Bias - The future is accidentally biased?

Yesterday was one such day. I’d had a pretty normal one of keeping across all the usual priorities and Teams calls, figuring out our editorial calendar and the upcoming , all the while refreshing some buyer and user personas for our Self-Service Data Quality platform.

Buyer personas themselves are hardly a new thing, and they’re typically represented by an icon or avatar of the buyer or user themselves. This time, rather than pile all our hopes, dreams and expectations into a bunch of cartoons, I figured I’d experiment a little. Back in January I’d been to an , where I’d heard about Generative Adversarial Networks (GANs) and the ability to use AI to create images of pretty much anything.

Being someone who likes to use tech first and ask questions later, I headed over to the always entertaining  where GANs do a pretty stellar job of creating highly plausible-looking people who don’t exist (with some amusing if mildly perturbing issues at the limitations of its capability!). I clicked away, refreshing the page and copying people into my persona template, assigning our typical roles of Chief Data Officer, Data Steward, Chief Risk Officer and so on; it wasn’t until I found myself pasting them in that I realised how hard it was to generate images of people who were not white. Or indeed how it was impossible to generate anyone with a disability or a degenerative condition.

Buyer personas are supposed to reflect all aspects of likely users of the technology, yet this example of AI would unintentionally bias our product and market research activities to overlook people who did not conform to the AI’s model. My colleague Raghad Al-Shabandar wrote about this recently, and I think probably the most impactful part of this, for me, anyway, was the following quote:

The question, then, is developing models for the society we wish to inhabit, not merely replicating the society we have.

In the website’s case, it’s even worse: it obliterates the society we currently have, by creating images that don’t reflect the diversity of reality, instead layering on an expected or predicted society that is over 50% white and 0% otherwise-abled.

I should make it clear that I’m a big fan of this tech, not least for the bafflement my kids have at the non-existence of a person who looks very much like a person! But at the same time, I think it perhaps exposes the risk all AI projects have – did we really think of every angle about what society looks like today, and did we consider how society ought to look?

These are subjective points that vary wildly from culture to culture and country to country, but we must ensure that every minority and element of diversity is in the room when we’re making such decisions or we risk baking-in bias before we’ve even begun.


The post AI Bias – The Future is Accidentally Biased? appeared first on 糖心传媒.

]]>