Oversampling and undersampling

Oversampling and undersampling are two common methods for dealing with imbalanced data sets, in which one class has far more examples than the other. Oversampling duplicates minority class examples until the classes are balanced, while undersampling removes majority class examples until the classes are balanced. Both methods have advantages … Read more
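As a rough illustration of the two approaches described above, the sketch below balances a labelled data set by random duplication and by random removal. The function names `random_oversample` and `random_undersample` are hypothetical, and choosing examples uniformly at random is an assumption; the excerpt does not say how examples are picked.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen examples of smaller classes
    until every class is as large as the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of each class so that every class
    is as small as the smallest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    out_rows, out_labels = [], []
    for cls in counts:
        pool = [r for r, y in zip(rows, labels) if y == cls]
        out_rows.extend(rng.sample(pool, target))
        out_labels.extend([cls] * target)
    return out_rows, out_labels


# Illustrative data: 8 majority examples and 2 minority examples.
rows = list(range(10))
labels = ["majority"] * 8 + ["minority"] * 2

_, over_labels = random_oversample(rows, labels)
_, under_labels = random_undersample(rows, labels)
print(Counter(over_labels))   # both classes now have 8 examples
print(Counter(under_labels))  # both classes now have 2 examples
```

Duplication keeps every original example but repeats minority rows, while removal discards majority rows, so which one fits depends on how much data can be spared.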

Econometrics

Econometrics is the application of statistical methods to economic data in order to give empirical content to economic relationships. Econometrics is used in a variety of areas, such as in the testing and estimation of economic models, in the analysis of economic data, and in forecasting. Is econometrics the same as economics? No, econometrics is … Read more

De-anonymization (deanonymization)

De-anonymization (deanonymization) is the process of taking data that has been anonymized and using it to identify specific individuals. This can be done through a variety of methods, including linking data sets, using known information about an individual, or using sophisticated data analysis techniques. De-anonymization can have serious consequences for individuals, as it can lead … Read more

In-memory database

An in-memory database is a database that keeps its working data primarily in main memory (RAM). In-memory databases are often used for high-performance applications where response times are critical. They can be either write-through or write-back. Write-through in-memory databases write to both the in-memory copy and the persistent copy of the database simultaneously. Write-back in-memory databases write to the in-memory … Read more
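The difference between the two write policies can be sketched with a toy key-value store. This is a minimal illustration, not how any particular product works: the class names are hypothetical, a plain dict stands in for the persistent copy, and the write-back behaviour (update memory first, persist later on an explicit flush) follows the common meaning of the term, since the excerpt above is cut off before describing it.

```python
class WriteThroughStore:
    """Every write updates the in-memory copy and the persistent copy together."""

    def __init__(self, persistent):
        self.memory = {}
        self.persistent = persistent  # a dict standing in for durable storage

    def put(self, key, value):
        self.memory[key] = value
        self.persistent[key] = value

    def get(self, key):
        return self.memory.get(key)


class WriteBackStore:
    """Writes update memory only; changes reach the persistent copy on flush()."""

    def __init__(self, persistent):
        self.memory = {}
        self.persistent = persistent
        self.dirty = set()  # keys changed since the last flush

    def put(self, key, value):
        self.memory[key] = value
        self.dirty.add(key)

    def get(self, key):
        return self.memory.get(key)

    def flush(self):
        for key in self.dirty:
            self.persistent[key] = self.memory[key]
        self.dirty.clear()


disk_a, disk_b = {}, {}
through = WriteThroughStore(disk_a)
back = WriteBackStore(disk_b)

through.put("user:1", "Ada")
back.put("user:1", "Ada")
print(disk_a)  # already contains the write
print(disk_b)  # still empty until flush() is called
back.flush()
print(disk_b)  # now contains the write
```

In this sketch the write-through store never loses an acknowledged write but pays the persistence cost on every `put`, while the write-back store is faster per write and risks losing anything not yet flushed.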

Anonymous video analytics (AVA)

AVA is a term used to describe a video analytics solution that does not require any personal information to be collected in order to function. AVA solutions are typically used for security or marketing purposes, and can be deployed in a variety of settings, including public spaces, retail stores, and office buildings. AVA solutions typically … Read more

Zipf’s Law

Zipf’s law is an empirical observation about how frequencies are distributed in ranked data. It states that, in many cases, the frequency of an event is inversely proportional to its rank when events are ordered from most to least common. In other words, the second most common event is half as likely to occur as the most common event, the third most common … Read more
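The inverse-proportionality rule can be made concrete with a few lines of arithmetic. In this small sketch the function name `zipf_frequencies` and the starting count of 1200 are made up for illustration; the only thing taken from the text is that the frequency at rank r is the top frequency divided by r.

```python
def zipf_frequencies(top_frequency, n_ranks):
    """Idealised Zipf frequencies: the item at rank r occurs
    top_frequency / r times."""
    return [top_frequency / rank for rank in range(1, n_ranks + 1)]


# If the most common event occurs 1200 times, an ideal Zipf
# distribution predicts these counts for ranks 1 to 4.
print(zipf_frequencies(1200, 4))  # [1200.0, 600.0, 400.0, 300.0]
```

Real data rarely follows this pattern exactly, so the output is best read as the idealised curve that observed frequencies are compared against.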

Gartner hype cycle

The Gartner hype cycle is a tool used by analysts to represent the maturity, adoption and social application of specific technologies. The cycle is divided into five phases: 1. Technology Trigger: A potential technology breakthrough kicks off the hype cycle. Early proof-of-concept stories and media interest trigger significant press coverage. Often no usable products exist … Read more

Decision tree

A decision tree is a decision support tool that uses a tree-like model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. It is one way to display an algorithm that only considers one factor at a time and allows for different options depending on the situation. A decision tree … Read more
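One way to picture how such a tree combines chance outcomes, costs, and utility is to compute the expected value of each option and pick the best. The sketch below is only an illustration under assumptions: the function names, the dictionary layout, and the numbers are all hypothetical, and choosing by highest expected value minus cost is one common decision rule, not something stated in the excerpt.

```python
def expected_value(branches):
    """Expected payoff of a chance node given (probability, payoff) pairs."""
    return sum(probability * payoff for probability, payoff in branches)


def best_option(options):
    """Return (name, net value) of the option with the highest
    expected payoff after subtracting its cost."""
    scored = {
        name: expected_value(option["branches"]) - option["cost"]
        for name, option in options.items()
    }
    best = max(scored, key=scored.get)
    return best, scored[best]


# Hypothetical decision: launch a product or hold off.
options = {
    "launch": {"cost": 50, "branches": [(0.5, 200), (0.5, 0)]},
    "hold": {"cost": 0, "branches": [(1.0, 40)]},
}

print(best_option(options))  # ('launch', 50.0)
```

Here launching has an expected payoff of 100 and a cost of 50, which leaves a net value of 50 and beats the certain payoff of 40 from holding.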

GoodData

GoodData is a cloud-based data analytics platform that enables users to securely connect to data sources, prepare and clean data, and build and share custom reports and dashboards. GoodData also offers a suite of pre-built applications for specific verticals, such as e-commerce, retail, and financial services. Who owns GoodData? GoodData is a data analytics company … Read more

Time series chart

A time series chart is a graphical representation of data points plotted in time order. This type of chart is typically used to visualize trends or patterns over a period of time. Time series charts can be used to track how a value changes, such as changes in a company’s stock … Read more