<br>

# .center[EMBRACING THE BLESSING OF DIMENSIONALITY TO DETERMINE SPECIES’ PROVENANCE]

<br><br><br><br>

### Kevin Cazelles .small[![:faic](twitter) [KCazelles](https://twitter.com/KCazelles)]

<img src="img/map.png" style="position: absolute; bottom: -200px; right: -200px" width = 600px;></img>

#### Emelia Myles-Gonzalez, Tyler Zemlak, Kevin S. McCann .small[[@McCannLab](https://www.mccannlab.ca/)]

<br><br>

#### ![:faic](globe) QCBS Annual symposium, Montreal
#### ![:faic](calendar) December 18th, 2019
#### ![:faic](github) [KevCaz/fightingNoise](https://github.com/KevCaz/fightingNoise)

---
class: inverse, center, middle

# Context

<hr>

## The push for provenance

---

# The push for provenance

<br>

> Consumers are becoming increasingly aware of, and interested in, the origins of their seafood, particularly as issues such as environmental sustainability, impacts on endangered species, toxin accumulations, incidents of illegal, unregulated and unreported (IUU) fishing, quality assurances and human rights abuses are better understood.

![:faic](file-text) Roebuck, K. et al. (2017) [Canadians Eating in the Dark: A Report Card of International Seafood Labelling Requirements](http://www.seachoice.org/wp-content/uploads/2017/03/Seafood-Labelling-Report-Online.pdf)

---

# The push for provenance

.center[![:scale 52%](img/cardreport.png)]

![:faic](file-text) Roebuck, K. et al. (2017) [Canadians Eating in the Dark: A Report Card of International Seafood Labelling Requirements](http://www.seachoice.org/wp-content/uploads/2017/03/Seafood-Labelling-Report-Online.pdf)

---

# Sockeye salmon (*Oncorhynchus nerka*)

.center[![:scale 85%](img/sockeye.png)]

![:faic](file-text) Crawford, S.S. & Muir, A.M. (2008). [Global introductions of salmon and trout in the genus Oncorhynchus: 1870–2007](https://www.researchgate.net/publication/225224390_Global_introductions_of_salmon_and_trout_in_the_genus_Oncorhynchus_1870-2007). Reviews in Fish Biology and Fisheries.
---

# The push for provenance

<br>

### 1. We need regulations

#### .large[*e.g.* Country of Origin Labelling (COOL) in the USA]

<br>

--

### 2. We need tools to authenticate the provenance

#### .large[[What this talk is about!]()]

---

# Bio-tracers to create spatial fingerprints

<br>

## Bio + tracer

### - trace elements
### - stable isotopes
### - fatty acids
### - DNA
### - gut contents
### - etc.

---

# Vertical vs horizontal strategies

<br>

> The future of food authentication and food quality assurance critically depends on combining chemometrics, computational analytical methods, and bioinformatics in processing and interpreting the data obtained through analytical technique.

![:faic](file-text) Danezis, G. P., et al. (2016) [Food authentication: state of the art and prospects](https://www.sciencedirect.com/science/article/pii/S2214799316300844). *Current Opinion in Food Science*

<br>

---

# Why develop horizontal strategies?

<br>

.center[![:scale 95%](img/frout.png)]

![:faic](file-text) Chen, Dong, et al. (2013) [Blessing of Dimensionality: High-Dimensional Feature and Its Efficient Compression for Face Verification](https://www.cv-foundation.org/openaccess/content_cvpr_2013/html/Chen_Blessing_of_Dimensionality_2013_CVPR_paper.html). 2013 IEEE Conference on Computer Vision and Pattern Recognition.

---

<br><br><br><br><br><br>

# [HOW TO COMBINE BIO-TRACERS TO DETERMINE SPECIES PROVENANCE?]()

---
class: inverse, center, middle

# A Bayesian framework

---

# Where does this sample come from?

.center[![:scale 75%](img/map2a.png)]

---

# Where does this sample come from?

.center[![:scale 75%](img/map2b.png)]

---

# A Naive Bayes classifier

--

<br>

## - A priori knowledge
## - Sample (observations)

<br>

--

## - Quantity, quality, heterogeneity

<br>

--

## - Statistical model

---

# Origins and distributions

<br>

<img src="index_files/figure-html/figA2-1.png" style="display: block; margin: auto;" />

---

# Where does this sample come from?
.center[![:scale 75%](img/map2c.png)]

---

# Problem

<img src="index_files/figure-html/figA3-1.png" style="display: block; margin: auto;" />

--

## O<sub>1</sub> | Sample ?

--

## Sample | O<sub>1</sub> & Sample | O<sub>2</sub> known!

---

# A Bayesian approach

<br><br>

## [Assumptions]()

### 1. the sample comes from one of the areas considered
### 2. no mixture

<br>

`$$[O_1|S_{n}] = \frac{[S_{n}|O_1][O_1]}{\sum_j [S_{n}|O_j][O_j]}$$`

---

# Similarity & authentication

<br>

.center[![](img/FigSd1.png)]

---

# Similarity & authentication

<br>

.center[![](img/FigSd2.png)]

---

# Similarity & authentication

<br>

.center[![](img/FigSd3.png)]

---

# Similarity & authentication

<br>

.center[![](img/FigSd4.png)]

---

# Similarity & authentication

<br>

.center[![](img/FigSd5.png)]

---

# Similarity & authentication

<br>

.center[![](img/FigSd6.png)]

---

# Similarity & authentication

<br>

.center[![](img/FigSd7.png)]

---

# Dimensionality & authentication

<br>

## [Assumption](): Bio-tracers ~ N(μ, Σ)

### just as in a [Linear Discriminant Analysis (LDA)]()

<br>

--

### μ: means - fixed
### Σ: covariance matrix
#### diagonal (variances) fixed
#### off-diagonal terms (symmetric) vary:

---

# Dimensionality & authentication

.center[![:scale 90%](img/FigCor1.png)]

---

# Dimensionality & authentication

.center[![:scale 90%](img/FigCor2.png)]

---

# Dimensionality & authentication

.center[![:scale 90%](img/FigCor3.png)]

---

# Dimensionality & authentication

.center[![:scale 90%](img/FigCor4.png)]

---

# Dimensionality & authentication

.center[![:scale 90%](img/FigCor5.png)]

---

# Dimensionality & authentication

.center[![:scale 90%](img/FigCor6.png)]

---

# Dimensionality & authentication

.center[![:scale 90%](img/FigCor7.png)]

---

# Dimensionality & authentication

.center[![:scale 90%](img/FigCor8.png)]

---

# Dimensionality & authentication

.center[![:scale 90%](img/FigCor9.png)]

---

# Dimensionality & authentication

.center[![:scale 90%](img/FigCor10.png)]

---

# Dimensionality & authentication
.center[![:scale 90%](img/FigCor11.png)]

---

# Dimensionality & authentication

.center[![:scale 80%](img/fig_theo3a.png)]

---

# Dimensionality & authentication

.center[![:scale 80%](img/fig_theo3b.png)]

--

## .center[[Blessing of dimensionality!]()]

---

# So

<br><br>

--

## 1. the more dissimilar the distributions the better
## 2. the higher the dimensionality the better

<br>

--

## 3. Assuming extra bio-tracers enhance 1. or 2. (or both)

--

<br>

## [THE MORE BIO-TRACERS THE BETTER]()

---
class: inverse, center, middle

# Using the framework

---

# Data

<br>

.column[
### - Sockeye Salmon
### - 3 regions
### - 30 individuals/region
### - bio-tracers:
### 3 stable isotopes
### 14 fatty acids
]

.column[
![:scale 48%](img/map2d.png)
]

---

# How?

<br>

### - A priori knowledge 20 individuals | test on 10 individuals
### - All combinations of bio-tracers
### - Custom Bayes classifier / [LDA]() / Deep Learning

<br>

--

### cor(RU, US) > cor(CA, US) = cor(CA, RU)

---

# Results - increasing sample size

### 3 bio-tracers

.center[![:scale 73%](img/res1.png)]

---

# Results - increasing dimensionality

### sample size = 1

.center[![:scale 74%](img/res2.png)]

---
class: inverse, center, middle

# Perspectives

---

# Perspectives

<br>

## [Limits]()

### Unpack results about correlation structure
### Mixture problem
### Temporal variations

---

# Perspectives

<br>

## [Future]()

### Let's think vertical
### Can we use this to create reliable maps of probabilities of origin?

![](img/fig_map2a.png)

- ![:faic](file-text) Magozzi, S., et al. (2017) [Using ocean models to predict spatial and temporal variation in marine carbon isotopes.](https://data.giss.nasa.gov/o18data/grid.html) *Ecosphere*

---

# Many thanks to

<br><br>

--

### My co-authors and [the McCann Lab](https://www.mccannlab.org/)

<br><br>

--

### [Food From Thought](https://arrellfoodinstitute.ca/food-from-thought/)

![:scalec 50%](img/CanadaFirst.png)

---
class: inverse, center, middle

# THE END
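---

# Appendix: sketching the posterior

The posterior `[O_i | S_n]` from the Bayesian approach slide can be sketched in a few lines. This is a minimal illustration, not the classifier used for the results: it assumes Gaussian bio-tracers, independent individuals within a sample, and made-up means, covariances and priors. The function names `log_mvn_pdf` and `posterior_origin` are hypothetical.

```python
import numpy as np


def log_mvn_pdf(x, mean, cov):
    """Log-density of N(mean, cov) evaluated at each row of x."""
    x = np.atleast_2d(x)
    d = mean.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    # Row-wise squared Mahalanobis distance
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)


def posterior_origin(sample, means, covs, priors):
    """Posterior probability of each origin for a sample of n individuals.

    Assumes all individuals share one origin (no mixture) and are
    independent given that origin, so log-likelihoods add up.
    """
    log_post = np.array([
        log_mvn_pdf(sample, m, c).sum() + np.log(p)
        for m, c, p in zip(means, covs, priors)
    ])
    log_post -= log_post.max()  # stabilise before exponentiating
    post = np.exp(log_post)
    return post / post.sum()    # denominator: sum over origins j


# Illustrative values only: two origins, two bio-tracers
means = [np.array([0.0, 0.0]), np.array([1.0, 1.0])]
covs = [np.eye(2), np.eye(2)]
priors = [0.5, 0.5]
sample = np.array([[0.1, -0.2], [0.0, 0.3]])  # n = 2 individuals

post = posterior_origin(sample, means, covs, priors)
print(post)
```

Working with log-likelihoods keeps the product over individuals from underflowing as the sample size or the number of bio-tracers grows.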