TrueSet is Scientific Consilience's proprietary biomarker selection service. The workflow was designed around the peculiarities of biological data and builds on proprietary developments in information theory and statistics. Its top-down approach, from general biological data down to a specific kind of information such as mRNA expression values, allows different types of biomarkers to be processed accurately. The workflow itself depends on the actual assignment. A typical workflow, including quality control, is as follows.

Data inspection
When we receive data, we perform preliminary checks for coherence and consistency, based on the background information about the data that we obtained from our customers. Typically, the problems we detect are labeling errors such as misspellings, or data that do not fit the (assumed) data source. If we detect such inconsistencies, we immediately give detailed feedback so that our customers can clarify them.
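As an illustration of the kind of label check described above, here is a minimal sketch in Python. It is not the procedure used in the TrueSet service; the function name `check_labels`, the example labels, and the similarity cutoff of 0.6 are assumptions chosen for illustration only. It compares observed sample labels against the labels expected from the customer's description and suggests the closest expected label for each mismatch.

```python
from difflib import get_close_matches


def check_labels(observed, expected, cutoff=0.6):
    """Report observed labels that are not in the expected set.

    Each unexpected label is mapped to the closest expected label
    (a likely misspelling), or to None if nothing is similar enough.
    The function name and the cutoff are hypothetical choices.
    """
    expected_set = set(expected)
    report = {}
    for label in observed:
        if label not in expected_set:
            match = get_close_matches(label, expected, n=1, cutoff=cutoff)
            report[label] = match[0] if match else None
    return report


if __name__ == "__main__":
    # Hypothetical labels: "contrl" is a probable typo, "xyz" is unknown.
    print(check_labels(["control", "contrl", "treated", "xyz"],
                       ["control", "treated"]))
```

A label mapped to None would be a candidate for data that do not fit the assumed data source and would need clarification rather than automatic correction.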

Data preprocessing
Usually, data have to be normalized. To take the simplest case, samples are typically processed on different days and sometimes with different measuring devices, so we need methods to reduce this technical variability. Which normalization method is most appropriate depends on the data set. If this is not known a priori from information about the data source, Scientific Consilience examines the data distribution to find the most suitable normalization technique. Furthermore, data typically contain outliers. Detecting outliers and cleaning them from the data set is a crucial task in data preprocessing. Outliers are handled carefully to avoid distorting or manipulating the statistical results.
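The two preprocessing ideas above can be sketched with generic textbook techniques. The following Python example is an assumption-laden illustration, not the method used by Scientific Consilience: it reduces day-to-day variability by subtracting each batch's median (`center_by_batch`) and flags outliers with a modified z-score based on the median absolute deviation (`flag_outliers`). The function names, the threshold of 3.5, and the example values are all hypothetical.

```python
from statistics import median


def center_by_batch(values, batches):
    """Subtract the median of each batch from the samples in that batch.

    A very simple correction for batch effects such as processing day
    or measuring device; real data may need a different method.
    """
    by_batch = {}
    for value, batch in zip(values, batches):
        by_batch.setdefault(batch, []).append(value)
    batch_median = {b: median(vs) for b, vs in by_batch.items()}
    return [v - batch_median[b] for v, b in zip(values, batches)]


def flag_outliers(values, threshold=3.5):
    """Flag values whose MAD-based modified z-score exceeds the threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread around the median: nothing can be flagged reliably.
        return [False] * len(values)
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]


if __name__ == "__main__":
    # Hypothetical measurements taken on two different days.
    values = [10.0, 11.0, 12.0, 20.0, 21.0, 22.0]
    days = ["day1", "day1", "day1", "day2", "day2", "day2"]
    print(center_by_batch(values, days))
    print(flag_outliers([1.0, 1.1, 0.9, 1.0, 1.2, 9.0]))
```

Flagging rather than deleting outliers keeps the decision about how to handle them explicit, in line with the careful handling described above.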

Feature selection
We have developed proprietary software, also named TrueSet, that combines information-theoretical criteria with statistical evaluations to identify the biomarkers with the highest predictive power.

This software is the main part of our biomarker selection service.

Our highly parallelized software allows the use of computationally expensive, extended information-theoretical criteria. Furthermore, these routines have a unique technique for correcting and adjusting themselves. In this way, the software adapts to the actual peculiarities of the data and promises high accuracy. We usually identify the optimal number of features first and then select the best features for subsets of a given size.
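The TrueSet software itself is proprietary, so its criteria are not shown here. As a generic stand-in, the following Python sketch ranks discretized features by their mutual information with the class labels, one of the simplest information-theoretical selection criteria. The function names `mutual_information` and `rank_features`, the feature names, and the example data are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(feature, labels):
    """Mutual information (in bits) between a discrete feature and labels."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    p_x = Counter(feature)
    p_y = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with counts.
        mi += (count / n) * log2(count * n / (p_x[x] * p_y[y]))
    return mi


def rank_features(features, labels, k):
    """Return the names of the k features with the highest mutual information."""
    scores = {name: mutual_information(column, labels)
              for name, column in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


if __name__ == "__main__":
    # Hypothetical binarized expression values for four samples.
    labels = [0, 0, 1, 1]
    features = {"geneA": [0, 0, 1, 1], "geneB": [0, 1, 0, 1]}
    print(rank_features(features, labels, k=1))
```

Scoring each feature on its own ignores redundancy and interactions between features; criteria that evaluate whole subsets are more expensive, which is where parallelization becomes relevant.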

Typically, our customers have concrete ideas about how results should be presented. Our results are postprocessed (e.g. figures, tables, and reports are generated) and presented in the appropriate fashion. Finally, the results are discussed and, depending on the actual assignment, we give hints on how to proceed, based on our experience.

Please contact us for your individual offer at

TrueSet (at)

or use our contact form to get in touch with us right now.