PRIMsrc

Welcome to The Bump Hunting Project by Patient Rule Induction Method. This website hosts a brief description of the goal of the project and its software `PRIMsrc`. It describes why and how you can use the software and provides some general remarks and links about it.

Overview

The general problem in "Bump Hunting" (BH) is to identify, characterize and predict hidden structures in the data that are informative and significant. In practice, "Bump Hunting" refers to the task of mapping out local regions of the input space (attribute/feature/predictor) where a target function of interest, usually unknown, assumes larger (or smaller) values than its average over the entire space. These sought-after regions of extreme values in the target function are also known as local/global extrema supports. The input space searched may be of low or high dimension, and its inputs may be any kind of variable (attributes, features, predictors, etc.). The target function may be any function of interest. See the Wiki page for details.

The picture below illustrates the idea. The sunshine over the mountain range shows how light can uncover peaks, highlands and valleys, just like we want to do for data structures in the target function by "Bump Hunting".

"Bump Hunting" applies to mathematical / statistical problems such as:

Mode(s) Hunting
Local/Global Extremum(a) Finding
Subgroup(s) Identification
Outlier(s) Detection
…

PRIMsrc implements a unified treatment of the "Bump Hunting" task in high-dimensional space. It uses a generic rule-induction algorithm by recursive peelings derived from the Patient Rule Induction Method (PRIM), initially introduced by Friedman & Fisher in 1999 (see Wiki "References"). It generates simple decision rules delineating a region (or regions) in the multi-dimensional input space where the target function is unusually large (or small) relative to its average over the entire space.
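To make the idea of recursive peeling concrete, here is a minimal Python sketch for a plain numeric response. It is not the PRIMsrc implementation: the function name `prim_peel` and the parameters `alpha` (fraction of points peeled per step) and `min_support` (smallest allowed box, as a fraction of the data) are hypothetical names chosen for illustration. The sketch simply stops when no peel raises the mean, and it leaves out the cross-validation and survival-specific criteria a full treatment would need.

```python
import random
from statistics import mean


def prim_peel(X, y, alpha=0.1, min_support=0.1):
    """Shrink a box one face at a time to raise the mean of y inside it.

    X is a list of rows (lists of floats), y a list of floats.
    Returns (box, inside): box[j] = [lower, upper] for input j, and
    inside = indices of the rows that fall in the final box.
    """
    n, p = len(X), len(X[0])
    # Start from the box that covers all the data.
    box = [[min(r[j] for r in X), max(r[j] for r in X)] for j in range(p)]
    inside = list(range(n))
    while True:
        current = mean(y[i] for i in inside)
        best = None  # (mean after peel, input j, side, new bound, kept rows)
        for j in range(p):
            vals = sorted(X[i][j] for i in inside)
            # Number of points to peel off one face (at least 1).
            k = min(max(1, int(alpha * len(vals))), len(vals) - 1)
            for side, bound in ((0, vals[k]), (1, vals[-k - 1])):
                if side == 0:   # peel the lower face of input j
                    kept = [i for i in inside if X[i][j] >= bound]
                else:           # peel the upper face of input j
                    kept = [i for i in inside if X[i][j] <= bound]
                if len(kept) == len(inside) or len(kept) < min_support * n:
                    continue    # nothing removed, or box too small
                m = mean(y[i] for i in kept)
                if best is None or m > best[0]:
                    best = (m, j, side, bound, kept)
        # Simplified stopping rule: stop when no peel improves the mean.
        if best is None or best[0] <= current:
            break
        _, j, side, bound, inside = best
        box[j][side] = bound
    return box, inside


# Toy data (assumed for illustration): the response is high in one corner.
rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(400)]
y = [1.0 if a > 0.6 and b > 0.6 else 0.0 for a, b in X]
box, inside = prim_peel(X, y)
print("box:", box)
print("mean inside:", mean(y[i] for i in inside), "overall:", mean(y))
```

Each returned interval `box[j]` reads directly as a decision rule of the form "lower <= input j <= upper", which is what makes the resulting region easy to interpret.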

Why Use PRIMsrc?

The method is attractive because it (i) makes minimal assumptions about the data, (ii) gives easily interpretable rules with estimated variance, and (iii) can target any desired response type: it is supervised and handles Survival, Regression and Classification (SRC) settings.

Unlike classical regression, classification and clustering problems, "Bump Hunting" is interested in:

Understanding and characterizing newly identified sub-groups of samples and homogeneous sub-populations
Discovering and describing sub-groups of samples and sub-populations with extreme responses
Identifying and predicting future sub-groups of samples and sub-populations with extreme responses
Customizing and/or targeting sub-groups of samples and sub-populations with extreme responses
…

Applications exist in a growing range of fields, including Medicine, Engineering, Materials Research, Marketing, Business Analytics, Actuarial Science and Behavioral Science:

subgroup finding
disparity subtyping
alternative drug/treatment indication (re-purposing)
personalized medicine (improved accuracy of diagnostication and/or prognostication)
economical medicine (hot spotting)
system reliability analysis in engineering
duration analysis/modeling in economics
event history analysis in sociology
financial securities return
insurance risk assessment/management
…

Readme

Visit the software Readme webpage to learn about License, Downloads, Branches, Requirements, Installation and Usage.

Wiki

Visit the project Wiki webpage for Roadmap, Documentation, Examples, Publications, Case Studies, Support and How to Contribute (code and documentation).

Authors/Contributors

Jean-Eudes Dazard, PhD.
Center for Proteomics and Bioinformatics (at the time of study/design)
Case Western Reserve University
Cleveland, Ohio, USA

J. Sunil Rao, PhD.
Division of Biostatistics
Department of Epidemiology and Public Health
The University of Miami
Miami, Florida, USA

Michael LeBlanc, PhD.
Fred Hutchinson Cancer Research Center
Public Health Sciences.
Department of Biostatistics, School of Public Health
The University of Washington
Seattle, Washington, USA

Michael Choe, MD.
Case Western Reserve University (at the time of study/design)
Cleveland, Ohio, USA

Tarn Duong, PhD.
Research scientist
Computer Science Laboratory (LIPN)
University of Paris 13
Paris, France

Acknowledgements

Project funded in part by the National Institutes of Health - National Cancer Institute, Grant R01-CA160593, awarded to J. Sunil Rao and J-E. Dazard (co-PIs). This work was also made possible thanks to the help of Alberto Santana, MBA (Analyst Programmer, CWRU) and the High Performance Computing Resource in the Core Facility for Advanced Research Computing at Case Western Reserve University. Thanks also to professional photographer Bill Wight CA for the illustration picture above.
