NMR Structure Calculation:
Backbone and side-chain assignments were obtained for over 90% of the residues of A1LS and the J2LS fragments using standard triple-resonance experiments. Secondary structure was estimated from the available chemical shifts using TALOS and was found to be consistent with the proposed structures (S2, S3) [48]. Cystine CB chemical shifts were uniformly >33 ppm in the oxidized form, suggesting that all of these residues were involved in disulfide bonds and providing further verification of the final structures (S4).
NOESY spectra, in conjunction with the backbone and side-chain assignments, were used to calculate the 3D structures via the PONDEROSA server [49, 50]. All of the LS fragments had a paucity of true long-range NOE constraints, even relative to other vicilin precursor fragments (S6). As such, multiple iterations of CYANA were employed to identify putative long-range NOEs [51], which were verified manually and resubmitted to PONDEROSA for further refinement until the software suggested no new putative long-range NOEs. NMR structural statistics were calculated using the Protein Structure Validation Software (PSVS) suite and are shown in S6 [52].

Statistical analysis of microarray data

Prior to analysis, we collected the median signal-to-noise ratio (SNR) for each peptide spot in J2LS and A1LS for each patient. We chose SNR as our signal measurement because it corrects the raw signal intensity for non-specific hybridization and instrument noise (link). Each SNR value was then converted into a modified z-score, which is calculated using the median and median absolute deviation (MAD) rather than the mean and standard deviation. Z-scores represent a standardized signal intensity and are frequently used to report immunoglobulin binding to peptide microarrays [53]. The median and MAD were calculated for each patient and leader sequence combination. MAD was calculated with the mad function in R using the constant 1.4826 to approximate the standard deviation. After calculating the patient- and LS-specific median and MAD, z-scores were calculated by subtracting the median from the relevant spot SNR and dividing this value by the MAD. We defined a true IgE binding event as an SNR with a modified z-score >= 3.
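The modified z-score conversion described above can be sketched as follows. This is a minimal illustration in Python rather than the R code used in the study; the function names and the example SNR values are hypothetical, and only the constant 1.4826 and the threshold of 3 come from the text.

```python
from statistics import median

MAD_CONSTANT = 1.4826  # scale factor stated in the text
Z_THRESHOLD = 3.0      # binding threshold stated in the text


def scaled_mad(values, constant=MAD_CONSTANT):
    """Median absolute deviation about the median, scaled by `constant`."""
    center = median(values)
    deviations = [abs(v - center) for v in values]
    return constant * median(deviations)


def modified_z_scores(snr_values):
    """Modified z-score for each SNR: (value - median) / scaled MAD.

    `snr_values` is assumed to hold the spot SNRs for one patient and
    one leader sequence, since the median and MAD are computed per
    patient and LS combination.
    """
    center = median(snr_values)
    spread = scaled_mad(snr_values)
    if spread == 0:
        raise ValueError("MAD is zero; modified z-scores are undefined")
    return [(v - center) / spread for v in snr_values]


def binding_events(snr_values, threshold=Z_THRESHOLD):
    """Flag each spot whose modified z-score is at or above the threshold."""
    return [z >= threshold for z in modified_z_scores(snr_values)]


# Hypothetical SNR values for one patient and one leader sequence.
example_snr = [1.0, 2.0, 3.0, 4.0, 100.0]
example_flags = binding_events(example_snr)
```

Because the median and MAD are insensitive to a few very high values, a small number of strongly bound spots does not inflate the scale estimate the way it would with a mean and standard deviation.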

Statistical analyses were performed using R (version 3.6.3; R Development Core Team, available at www.r-project.org). A goal of this study was to assess cross-reactivity among walnut-, peanut- or co-allergic individuals to regions of A1LS and J2LS. For each LS and predicted J2LS fragment, we counted the number of significant IgE binding events (z-value > 3) to their representative microarray peptides for each allergy type: walnut (WN), peanut (PN), or co-allergy (PW). For each LS or J2LS fragment, we then determined the percentage of peptides bound by IgE in each allergy group. For example, JR2.1 is represented by 11 peptides on each microarray and there are 12 walnut-allergic individuals, giving a total of 132 possible IgE binding events. As 42 of these possible events exhibited IgE binding with a z-value > 3, we determined that walnut-allergic patients bound 32% of the possible JR2.1 microarray peptides. We then used the fisher.test function to perform Fisher’s exact tests comparing the percentage of peptides with significant IgE binding among the allergy groups for each LS or J2LS fragment.
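The percentage calculation and the group comparison can be illustrated with a short sketch. It is written in Python, not the R code used in the study, and implements a two-sided Fisher's exact test for a single 2x2 table directly from the hypergeometric distribution. The function names are hypothetical, the walnut counts (42 of 132) come from the JR2.1 example in the text, and the second group's counts are invented purely for illustration.

```python
from math import comb


def percent_bound(n_bound, n_peptides, n_patients):
    """Percentage of possible binding events (peptides x patients) observed."""
    possible = n_peptides * n_patients
    return 100.0 * n_bound / possible


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    total = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)  # small tolerance for float ties
    )
    return min(1.0, total)


# JR2.1 example from the text: 11 peptides x 12 walnut-allergic patients.
walnut_pct = percent_bound(42, 11, 12)  # rounds to 32

# Hypothetical comparison group (counts are NOT from the study).
walnut_bound, walnut_unbound = 42, 132 - 42
other_bound, other_unbound = 10, 100
p_value = fisher_exact_two_sided(
    walnut_bound, walnut_unbound, other_bound, other_unbound
)
```

Each row of the table holds the bound and unbound counts for one allergy group, so the test compares counts of binding events rather than the percentages themselves.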