A glossary for professionals & the curious.

AI Autonomy : The ability of AI systems to make decisions and act independently, while adhering to ethical and legal standards.

AI equity : The goal of ensuring that AI systems do not discriminate against or privilege certain individuals or groups on the basis of race, gender, age, sexual orientation, or other characteristics.

AI impact assessment : Assessing the effects of AI on individuals, organizations, and society as a whole to minimize negative impacts and maximize positive benefits.

AI interpretability : The ability to understand how AI systems make decisions, so they can be explained and improved.

AI sustainability : Designing and using AI responsibly to minimize its long-term environmental, economic and social impact.

Algorithm : A finite sequence of rules or operations executed on data to obtain a result.
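
As a minimal illustration, here is a short Python sketch of an algorithm: a fixed sequence of rules (slice, sum, divide) applied to data to obtain a result. The input values are hypothetical.

```python
def moving_average(values, window):
    """Apply a fixed sequence of rules (slice, sum, divide) to a list of numbers."""
    averages = []
    for i in range(len(values) - window + 1):
        averages.append(sum(values[i:i + window]) / window)
    return averages

# Hypothetical input data: daily temperatures
print(moving_average([12, 14, 15, 13, 16, 18], window=3))  # ≈ [13.67, 14.0, 14.67, 15.67]
```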

AI privacy and security : Protecting sensitive user data, preventing data breaches, and securing AI systems against cyberattacks.

Megadata analysis : A research technique that analyzes large volumes of data using algorithms, specialized software tools, or artificial intelligence systems to extract actionable information for decision-making. Analyzing megadata brings out correlations and underlying structures that are difficult to detect in a mass of raw data, providing elements of understanding.

Machine Learning : a field of artificial intelligence that focuses on the development of models that can learn from data.

Deep Learning : a subfield of machine learning that uses artificial neural networks with multiple layers to learn hierarchical data representations.

Federated Learning : a distributed machine learning method in which data is held locally on users’ devices, rather than being collected on a centralized server.
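
A minimal sketch of the idea behind federated averaging, assuming three clients whose locally trained model weights (hypothetical numbers here) are combined on a server while the raw data never leaves the devices:

```python
import numpy as np

# Hypothetical local model weights, trained on three users' devices;
# only the weights are shared with the server, never the raw data.
client_weights = [
    np.array([0.9, 1.1]),
    np.array([1.0, 0.8]),
    np.array([1.2, 1.0]),
]
client_sizes = [100, 50, 150]  # number of local training samples per device

# Federated averaging: the server computes a weighted mean of the client models.
total = sum(client_sizes)
global_weights = sum(w * (n / total) for w, n in zip(client_weights, client_sizes))
print(global_weights)  # the new global model, sent back to the devices
```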

Unsupervised learning : a type of machine learning in which the model is trained on an unlabeled dataset, i.e., data where the expected results are not known.
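
A brief illustration with scikit-learn, assuming a small unlabeled dataset: k-means clustering discovers group structure without ever being shown expected results.

```python
import numpy as np
from sklearn.cluster import KMeans

# Unlabeled data: no expected output is provided.
X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]])

# K-means discovers structure (here, two clusters) on its own.
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(model.labels_)  # e.g. [1 1 1 0 0 0] — cluster assignments, not given labels
```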

Transfer Learning : a method of machine learning in which a model pre-trained on one task can be adapted to another task without the need for a large data set.

Supervised learning : a type of machine learning in which the model is trained on a labeled data set, i.e. data where the expected results are known.

Reinforcement Learning : a type of machine learning in which a model learns to make decisions by interacting with an environment and receiving rewards or penalties for its actions.
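
A minimal tabular Q-learning sketch on a hypothetical toy environment (states on a line, with a reward for reaching the goal); it is illustrative only, not a production setup:

```python
import random

# Toy environment: states 0..3 on a line; reaching state 3 yields reward 1.
# Q-learning is off-policy, so we can explore with random actions and still
# learn the optimal values.
N_STATES, GOAL = 4, 3
Q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q[state][action]; actions: 0=left, 1=right
alpha, gamma = 0.5, 0.9

for _ in range(300):                        # training episodes
    s = 0
    while s != GOAL:
        a = random.randint(0, 1)            # explore: pick a random action
        s2 = max(0, s - 1) if a == 0 else s + 1
        r = 1.0 if s2 == GOAL else 0.0      # reward signal from the environment
        # Update: nudge Q toward the reward plus the discounted best future value.
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

print([row.index(max(row)) for row in Q[:GOAL]])  # learned policy: [1, 1, 1] = always go right
```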

Database : A structured set of data elements, usually organized in tables, arranged so that the data can be readily used. A database may consist of a single file that itself contains several tables.

Knowledge base : A database containing all the information integrated into an artificial intelligence system. The knowledge base is usually part of a knowledge-based system.

Value Chain : The process of transforming raw data into something of value.

Chatbot : A computer program designed to simulate a conversation with human users, using predefined rules or artificial intelligence. Chatbots can be used to answer common questions, help solve routine problems, or simply engage in conversation with users.
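
A minimal sketch of a rule-based chatbot; the keywords and replies are hypothetical, and real systems may use AI models instead of fixed rules:

```python
# Predefined rules mapping keywords to canned replies (all hypothetical).
RULES = {
    "hours": "We are open 9am-5pm, Monday to Friday.",
    "price": "Our pricing page lists all current plans.",
    "hello": "Hello! How can I help you today?",
}

def reply(message: str) -> str:
    """Return the first reply whose keyword appears in the message."""
    text = message.lower()
    for keyword, answer in RULES.items():
        if keyword in text:
            return answer
    return "Sorry, I did not understand. Could you rephrase?"

print(reply("Hello there"))           # -> greeting
print(reply("What are your hours?"))  # -> opening hours
```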

Classification : A machine learning task in which a model predicts the class of an object based on specific characteristics.
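
A short scikit-learn sketch, using a hypothetical fruit dataset, that illustrates classification (and supervised learning, defined above): the model is trained on labeled examples and then predicts the class of a new object.

```python
from sklearn.neighbors import KNeighborsClassifier

# Labeled training data (hypothetical): fruit measurements -> class labels.
X_train = [[150, 7.0], [170, 7.5], [120, 6.0], [110, 5.5]]  # [weight_g, width_cm]
y_train = ["apple", "apple", "lemon", "lemon"]

# Supervised learning: the model is fit on examples with known classes...
model = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)

# ...then predicts the class of a new, unseen object from its characteristics.
print(model.predict([[130, 6.2]]))  # -> ['lemon']
```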

Dashboards : Graphical user interfaces that allow users to visualize and analyze data in a concise, easily understandable way.

Discoverability : The ability of a product, service, or content to be found by users who need or seek information on a specific topic. This involves building structured data that is open and linked, so that content can be found by search engines and read by machines.

Big data : A set of digital data whose volume exceeds human intuition and analysis capacities. On the Internet, we produce some 2.5 quintillion bytes of data every day: emails, videos, weather information, GPS signals, online transactions, etc. No traditional database management tool can handle this mass of data, which has required the development of new algorithms to store, classify, and analyze it.

Recommendation engines : Recommendation engines (or recommender systems) are computer algorithms designed to suggest relevant items (e.g., products, videos, or articles) to a user based on their preferences and usage history. These systems typically rely on data such as browsing history, previous purchases, ratings, and stated user preferences to generate personalized recommendations.
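
A minimal user-based collaborative-filtering sketch with a hypothetical rating matrix: find the most similar user, then suggest items they liked that the target user has not rated yet.

```python
import numpy as np

# Hypothetical user-item rating matrix (rows: users, columns: items; 0 = unrated).
ratings = np.array([
    [5, 4, 0, 1],   # target user
    [5, 5, 4, 1],   # user with similar tastes
    [1, 0, 5, 5],   # user with different tastes
])

def cosine(u, v):
    """Cosine similarity between two rating vectors."""
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# Find the user most similar to the target, then recommend items that
# neighbor rated highly but the target has not rated at all.
target = ratings[0]
sims = [cosine(target, ratings[i]) for i in range(1, len(ratings))]
neighbor = ratings[1 + int(np.argmax(sims))]
recommendations = [j for j in range(len(target)) if target[j] == 0 and neighbor[j] >= 4]
print(recommendations)  # -> [2]: item 2 is liked by the most similar user
```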

ETL (Extract / Transform / Load) : The process of (1) extracting data from its original source, (2) transforming the data by deduplicating and combining it and ensuring its quality, and then (3) loading the data into the target database. The cleansing and transformation step (2) is crucial, because data is often not initially collected for its eventual use (think of transaction data first collected for operational purposes and later used to predict sales); this is even more true with big data. Although it adds little visible value, this step often represents 70% of the time spent on a data enhancement project.
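
A compact ETL sketch in Python with pandas; the file names and columns are hypothetical:

```python
import sqlite3
import pandas as pd

# 1. Extract: read raw transaction data from its original source
#    (hypothetical file name and columns).
raw = pd.read_csv("transactions.csv")

# 2. Transform: deduplicate, drop incomplete rows, and normalize types.
clean = (
    raw.drop_duplicates()
       .dropna(subset=["amount"])
       .assign(amount=lambda df: df["amount"].astype(float))
)

# 3. Load: write the cleaned data into the target database.
with sqlite3.connect("warehouse.db") as conn:
    clean.to_sql("transactions", conn, if_exists="replace", index=False)
```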

Data mining : The process of searching and analyzing data to find hidden correlations or new information, or to identify certain trends.

User guide : Creating a user guide at the end of a project is important to ensure that the product is fully understandable to its future users. It translates scientific practices into plain, logical steps that make the product easier to use.

Artificial intelligence : The field of study that aims to artificially reproduce the cognitive faculties of human intelligence, in order to create systems or machines capable of performing functions that normally require human intelligence. In computer science, it focuses on the development of intelligent systems capable of perceiving their environment, reasoning, and acting autonomously.

Deliverable : The end result of a project: a concrete, tangible output that the client can assess and measure.

Megadata : A set of very large amounts of data, structured or not, in different formats and from multiple sources, which are collected, stored, processed and analyzed in a short time frame, and which are impossible to manage with traditional database management or information management tools.

Reasoning : A process by which a computer system performs a logical sequence, based on initial propositions and a knowledge base, to arrive at a conclusion.
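
A minimal forward-chaining sketch: starting from initial propositions and a small knowledge base of rules, the system derives conclusions step by step (the facts and rules are hypothetical):

```python
# Initial propositions (facts) and a small knowledge base of rules.
facts = {"socrates_is_human"}
rules = [
    ({"socrates_is_human"}, "socrates_is_mortal"),   # if human, then mortal
    ({"socrates_is_mortal"}, "socrates_will_die"),   # if mortal, then will die
]

# Forward chaining: apply rules repeatedly until no new conclusion appears.
changed = True
while changed:
    changed = False
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)
            changed = True

print(facts)  # the knowledge base now includes the derived conclusions
```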

Linear regression : A machine learning model that predicts a continuous variable (i.e., numerical data) from other variables.
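
A short example with NumPy and hypothetical data, fitting y ≈ ax + b by least squares and then predicting a continuous value:

```python
import numpy as np

# Hypothetical data: advertising spend (x) vs. sales (y).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])

# Fit y ≈ a*x + b by least squares; polyfit returns the coefficients.
a, b = np.polyfit(x, y, deg=1)
print(f"slope={a:.2f}, intercept={b:.2f}")
print("prediction for x=6:", a * 6 + b)  # predict a continuous value
```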

Neural Networks : Deep learning models inspired by the workings of the human brain, able to recognize complex patterns in data. Neural networks are used, among other things, to generate content (images, videos, text, etc.).
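
A small illustration with scikit-learn's multi-layer perceptron on the XOR problem, a pattern a purely linear model cannot learn; this is a sketch, and results can vary with initialization:

```python
from sklearn.neural_network import MLPClassifier

# XOR: output is 1 exactly when the two inputs differ.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]

# One hidden layer of 8 neurons lets the network learn the non-linear pattern.
model = MLPClassifier(hidden_layer_sizes=(8,), solver="lbfgs",
                      max_iter=2000, random_state=0)
model.fit(X, y)
print(model.predict(X))  # typically [0 1 1 0]
```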

ROI (return on investment) : An indicator calculated to measure the profit generated relative to the capital invested. The ROI of an AI project is not only financial; it can also include social, environmental, health, or marketing dimensions.

MLOps : Short for machine learning operations, MLOps aims to design learning models suited to deployment in production and then to maintain them throughout their lifecycle.

Data pooling : Combining the data and expertise of several players. This type of collaboration allows each participant to improve its capacities and activities through the strength of the group.

Packaging : Making the product usable and viable so that it can be presented.

Pipeline : A linear sequence of specialized modules, each performing one step of a larger process, in which the output of one module becomes the input of the next.
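
A short scikit-learn sketch of the idea, chaining two illustrative modules (a scaler and a classifier) so that the output of one feeds the next:

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Each named step is a specialized module; data flows through them in order.
pipe = Pipeline([
    ("scale", StandardScaler()),       # module 1: normalize features
    ("model", LogisticRegression()),   # module 2: classify
])

# Hypothetical training data.
X = [[1.0, 200.0], [2.0, 180.0], [8.0, 30.0], [9.0, 10.0]]
y = [0, 0, 1, 1]
pipe.fit(X, y)                         # runs the whole chain end to end
print(pipe.predict([[1.5, 190.0]]))   # -> [0]
```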

Quality Assurance (QA) : Any systematic process for determining whether a product or service meets specified requirements. QA establishes and maintains defined requirements for the development or manufacture of reliable products.

Data Scientist : A professional who analyzes and interprets complex data and models it in AI algorithms to describe, diagnose, and predict phenomena.

Technological solutionism : a tendency to believe that social, economic, or political problems can be solved primarily, if not exclusively, by technology. It is a belief in the universal effectiveness of technology to solve humanity’s problems, which can be associated with a utopian or techno-optimistic worldview.

Natural Language Processing : Natural languages are the ordinary languages human beings use to communicate and understand each other, as opposed to programming languages, for example. This is a field of artificial intelligence that focuses on understanding and generating this type of language.
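
One elementary NLP step is turning ordinary language into numbers a model can work with; a minimal bag-of-words sketch with made-up sentences:

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical sentences in natural language.
texts = [
    "the cat sat on the mat",
    "the dog chased the cat",
]

# Bag of words: represent each sentence by its word counts.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

print(vectorizer.get_feature_names_out())  # learned vocabulary
print(X.toarray())                         # word counts per sentence
```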

VPS (virtual private server) : Using this type of server makes it possible to work on data in a dedicated environment or country, and therefore to adapt the hosting of data to the needs of the organization.