The immune system contains traces of past and ongoing diseases. New technology allows us to sequence the immune system to discover patterns in the data for diagnosing and prognosticating on disease. The features associated with each label are complex and do not conform to rows in columns in a spreadsheet. Some people call this data unstructured, which is a misnomer, because the data has structure, albeit a different kind of structure. For that reason, we call the data non-conforming.
Take a look at two datasets of T-cell receptors to gain a deeper insight.
We will be publishing approaches to cope with the datasets over the coming months.
Source: Reddit Machine Learning