
Data science, Big Data and Analytics
Data science - insights and knowledge from traditional structured data, graph-shaped data, and time series/data streams

We've heard the “data is the new oil” phrase too many times already. But as businesses enter the new "software 2.0" era, the value of data is no longer just about the potential to gain insights to power data-driven decisions.

Data is both the foundation and the main ingredient for building successful AI solutions - you just can't apply Machine Learning to a problem unless you have data of sufficient quantity and quality to train your models. No two datasets are identical: each has its own statistical properties, data quality challenges, and potential value and use cases for a particular business. Even when two datasets are closely related, the right solution for each may be different, which is why developing AI solutions by training ML models only on publicly available datasets is not good enough.

With storage costs plummeting, there is no longer any valid reason to decide what data to store long term and what data to throw away - all data is potentially valuable in the era of AI/ML, and all data can be cost-effectively stored (the right solutions can store terabytes at negligible cost and make storing multi-petabyte datasets feasible). But at the same time, one can't just store vast amounts of data with yet unknown uses in production databases, where it would have catastrophic consequences for application performance. Big Data technologies and processes need to be adopted to successfully manage large and ever-growing datasets, which must be cataloged for future data science use cases and must satisfy compliance requirements.

Data science - from data to valuable business insights, and beyond

Traditional data science is about deriving insights and knowledge from data. At Tremend, we gathered our data science experience in real-world scenarios, analyzing client datasets in order to develop a multitude of AI/ML solutions, from churn prediction models and content and product recommender systems to anomaly and fraud detection.

This puts us in a unique position to tackle next-generation data science tasks, moving from understanding present data patterns to predicting future trends, enabling our customers to anticipate market developments and ultimately empowering them to stay one step ahead of the competition.

From data science to data products

While a data science team can provide insight, knowledge, and predictions for the business teams like sales and product development to base their decisions on, this is only the first step.

Most data science processes can be automated and delivered as data products that BI/BA/sales/marketing teams can use autonomously to understand data and generate predictions.

Big Data management - effective storage & cataloging of large amounts of data

Big Data may be overused as a marketing buzzword, but it's also a reality: as businesses store more and more data, their datasets quickly grow beyond the sizes that can be processed on a single machine, forcing the adoption of "big data technologies" like data-processing frameworks made to run on clusters of machines.

This changes the context of data science tasks - analyses and model development need to be done using different technologies, like big data frameworks and services. In particular, big data mining needs to be employed to extract smaller datasets suitable for traditional data science and model development.
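One simple way to extract a smaller working dataset from a source too large to load into memory is reservoir sampling. The sketch below is only an illustration of that idea, not a description of any specific pipeline: the `event_stream` generator and its fields are hypothetical stand-ins for a large event log, and on a real cluster the sampling would more likely be delegated to the big data framework in use.

```python
import random
from typing import Iterable, Iterator, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(records: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Draw k records uniformly at random from a stream of unknown length.

    Only k records are held in memory at any time (Algorithm R).
    """
    rng = random.Random(seed)
    sample: List[T] = []
    for index, record in enumerate(records):
        if index < k:
            # Fill the reservoir with the first k records.
            sample.append(record)
        else:
            # Keep the new record with probability k / (index + 1),
            # overwriting a uniformly chosen slot in the reservoir.
            slot = rng.randint(0, index)
            if slot < k:
                sample[slot] = record
    return sample


def event_stream(n: int) -> Iterator[dict]:
    """Hypothetical stand-in for a large event log read one row at a time."""
    for i in range(n):
        yield {"event_id": i, "amount": (i * 37) % 1000}


# Extract a 1,000-row working set from 100,000 streamed events.
subset = reservoir_sample(event_stream(100_000), k=1_000, seed=42)
print(len(subset))
```

The resulting subset is small enough for single-machine tools, while every record in the stream had the same chance of being selected.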

[Figure: data science, Big Data and analytics scheme]

Beyond analytics

Analytics is a broad term encompassing both the continuous gathering of data about user behavior and system operation and the continuous, automated analysis of this data.

Tremend has experience in the underlying technologies used to build enterprise-grade analytics pipelines, store and extract information from both single data points and time-series data, apply enhancement techniques to the data, and apply unsupervised AI/ML techniques to implement anomaly detection and fraud detection.

We bring to the table the ability to develop complex analytics-oriented data products that can automatically detect patterns in your data and provide alerts and insights.
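To make the idea of automatically flagging unusual points in a time series concrete, here is a minimal sketch of one of the simplest unsupervised baselines, a rolling z-score. It is an assumption chosen for illustration rather than the technique used in any particular product; the function name, the `window` and `threshold` parameters, and the sample readings are all hypothetical.

```python
from collections import deque
from statistics import mean, stdev
from typing import Deque, Iterable, List, Tuple


def rolling_zscore_anomalies(
    values: Iterable[float], window: int = 20, threshold: float = 3.0
) -> List[Tuple[int, float]]:
    """Flag points that deviate strongly from the preceding `window` points.

    Returns (index, value) pairs whose z-score against the rolling window
    exceeds `threshold`. No labels are needed, so the method is unsupervised.
    """
    history: Deque[float] = deque(maxlen=window)
    anomalies: List[Tuple[int, float]] = []
    for index, value in enumerate(values):
        # Only score a point once a full window of history is available.
        if len(history) == window:
            mu = mean(history)
            sigma = stdev(history)
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                anomalies.append((index, value))
        # The point joins the history whether or not it was flagged.
        history.append(value)
    return anomalies


# Hypothetical sensor readings: a small repeating pattern with one spike.
readings = [10.0 + 0.1 * (i % 5) for i in range(60)]
readings[45] = 25.0

print(rolling_zscore_anomalies(readings))
```

A production system would typically replace this baseline with a trained model and route the flagged points into an alerting pipeline, but the overall shape (score each new event against recent history, then alert) stays the same.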

What can we do for you?

1. Data science

  • Insights from structured data
  • Data you’ve already collected about purchases, customer behavior, and other business events can contain insights valuable for solving business problems like improving customer retention / reducing churn.

    We've not only helped our customers use their data to understand the causes of customer churn, but we've also taken it one step further by developing churn prediction models that allowed sales teams to be proactive in improving customer retention for our financial services customers. We have experience ranging from classical statistical modeling to deep learning applied to structured data at the data-science level.

    Technology-wise, we can pick the right tool for the job, starting with classic data science tools from the Python ecosystem (Pandas, scikit-learn, etc.), using neural networks and deep learning when/if it makes sense (employing industry-standard frameworks like TensorFlow, Keras, and/or PyTorch), and extending to Big Data frameworks like Spark and cloud technologies relevant for big data (Azure Databricks, AWS Glue & EMR, GCP BigQuery).

  • Insights from graph data
  • Data revolves around relations between things. When these relations represent an essential characteristic of a dataset, we can talk about graph-structured data. Graph analysis has been featured in the news recently as a valuable tool for analyzing information, as in the Pandora Papers investigation, and working with this type of data should be a requirement for any modern data scientist.

    Tremend has research-grade experience in this area. We are active partners in academic research projects investigating fraud and money laundering detection on financial graph datasets using both novel mathematical methods and deep learning. Graphomaly was born out of the need to tackle the increase in fraudulent transactions across the European space. Currently in the R&D phase, this advanced software package will be able to automatically discover illicit behavior like money laundering, illegal networks, tax evasion, and scams. Using AI algorithms and automated procedures to save time and money for companies and banks, this new tool will be able to process large graphs so that reaction time is reduced and fraud can be discovered in its incipient stages.

    Combined with our general AI/ML and software-development expertise, this positions us to tackle complex projects in this area - from data science to data/AI/ML product development. We have extensive experience with technologies ranging from industry-standard graph analysis frameworks (NetworkX etc.) and technologies for applying neural networks to graph-structured data to in-house developed graph analysis frameworks (some pending to be open-sourced/developed as part of research grants).

  • Insights from time-series datasets
  • Both analytics datasets and financial datasets often come in as streams of time-stamped events containing various attributes. Our experience in handling such datasets for anomaly detection in sensor/analytics data and fraud detection in financial transactions data ranges from initial data analysis and data science to the development and deployment of AI/ML models to process such data.

    At the technology level, we also have experience integrating such technologies into enterprise-grade analytics pipelines (ELK stack).

    We also offer opportunity discovery and planning for custom data products: we uncover opportunities for data products tailor-made to your available data and to the problems you encounter.

  • Insights from unstructured data
  • Our expertise in using AI/ML for computer vision tasks gives us the ability to handle data science problems on datasets with the most common types of unstructured data - images and videos. We have experience implementing integrated ID verification, face recognition & liveness detection solutions and deploying image object detection systems in big data contexts to extract structured data from large image datasets.

2. Customized AutoML solutions

Modern AutoML solutions are evolving fast, leading to the emergence of "light AI/ML" solutions that can be deployed using only basic data-science skills instead of requiring extensive expertise in developing AI/ML models and solutions. This is, of course, adequate only in limited scenarios and for building prototypes, but custom AutoML solutions can greatly augment the performance of an existing BI/BA team.

3. Big Data management strategy and architecture - data lakes & data warehouses, cloud-based or on-premises

Having a solid Big Data strategy and architecture, with a separate data lake and data warehouses implemented with technologies adequate for each customer’s needs and constraints, is crucial. We can provide the expertise for designing, implementing, and operating such solutions.

Our experience encompasses fully-managed cloud solutions (AWS Glue, AWS Athena, Azure Databricks, GCP BigQuery) and partially-managed cloud solutions (AWS EMR) for cost-effective handling of Big Data workloads.

Contact us to find out more about our vast array of data science solutions tailored to your business needs.


