Embryonic stem cell culture
J1 and R1 mESCs (gift from Howard Y. Chang) were grown on 0.2% gelatin-coated (Sigma G6144) tissue culture plates. Cells were maintained for a minimum of ten passages in mESC serum medium containing KnockOut™ D-MEM (Life Technologies 10829), 15% HyClone™ fetal bovine serum (FBS; Thermo Scientific™ SH30396.03), 100 U/mL Penicillin-Streptomycin (P/S; Life Technologies 15140), 1% MEM non-essential amino acids (Life Technologies 11140), 1% GlutaMAX™ (Life Technologies 35050), and 0.1 mM 2-mercaptoethanol (Life Technologies 21985). Mouse leukemia inhibitory factor (LIF; Millipore ESG1107; 1000 U/mL), purified Royalactin (0.5 mg/mL), or purified NHLRC3 (0.125 mg/mL) were added to culture as indicated. Cells were passaged every 2 to 3 days using Trypsin-EDTA (0.25%; Life Technologies 25200) and seeded at 15,000 cells/cm². Media and protein were changed daily.
For cultures under 2i conditions, J1, R1, and Rex1-GFP mESCs were grown on Poly-L-ornithine (Sigma-Aldrich)/Laminin (Fisher Scientific 23017–015) coated plates and CGR8.8 mESCs were grown on Matrigel (Corning 354277) plates coated according to manufacturer’s specifications. All cells were maintained for a minimum of ten passages in serum-free media containing: 1:1 Neurobasal:DMEM-F12 base (Thermo Scientific 21103049), 1% GlutaMAX, 1% high insulin N2 (Thermo Scientific 17502001), 1% B27 supplement (Thermo Scientific 12587001), 0.1 mM 2-mercaptoethanol, 1% Penicillin-Streptomycin, 1% MEM non-essential amino acids, and 1% sodium pyruvate (Thermo Scientific 11360070). Mouse leukemia inhibitory factor (LIF; Millipore ESG1107; 1000 U/mL), 1 μM PD0325901 (Selleckchem), 3 μM CHIR99021 (Selleckchem), purified Royalactin (0.5 mg/mL), or purified NHLRC3 (0.125 mg/mL) were added to cultures as indicated. Cells were passaged every 2 to 3 days using Accutase (Stemcell Technologies 07920) and seeded at 15,000 cells/cm². Media and protein were changed daily.
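The seeding density used at each passage translates into a simple calculation of how many cells, and what volume of counted suspension, to plate. The following is a minimal sketch of that arithmetic; the function names, the 10 cm² growth area, and the 1.0e6 cells/mL suspension are illustrative assumptions and not values from the protocol.

```python
def cells_to_seed(area_cm2: float, density_per_cm2: float = 15_000) -> int:
    """Return the number of cells to plate for a given growth area.

    The default density is the 15,000 cells/cm^2 stated in the text.
    """
    if area_cm2 <= 0:
        raise ValueError("growth area must be positive")
    return round(area_cm2 * density_per_cm2)


def suspension_volume_ml(cells_needed: int, cells_per_ml: float) -> float:
    """Volume of counted suspension (mL) that contains cells_needed cells."""
    if cells_per_ml <= 0:
        raise ValueError("cell concentration must be positive")
    return cells_needed / cells_per_ml


# Hypothetical example: a vessel with 10 cm^2 of growth area and a
# counted suspension at 1.0e6 cells/mL (both values are illustrative).
needed = cells_to_seed(10.0)
volume = suspension_volume_ml(needed, 1.0e6)
print(needed, volume)
```

The same helper applies to the 30,000 cells/cm² plating used before transduction by passing a different `density_per_cm2`.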
Production of recombinant Royalactin and NHLRC3
FLAG-MRJP1-His (GenBank ID# NM_001011579.1) was cloned into LakePharma’s proprietary antibiotic selection vector and transfected by electroporation into suspension CHO parental cells. The FLAG-MRJP1-His stable cell line was generated after a 2-week antibiotic selection period. FLAG-MRJP1-His was purified by a two-step chromatography method. Conditioned media was centrifuged, filtered, and loaded onto an anion exchange chromatography (AEX) resin pre-equilibrated with 20 mM Tris pH 7.5. FLAG-MRJP1-His was eluted by increasing the sodium chloride concentration, and fractions containing the protein were pooled. This sample was further polished by a second step, immobilized metal (Ni) affinity chromatography (IMAC), using increasing concentrations of imidazole for elution. SDS-PAGE was performed on each fraction, and the fractions containing FLAG-MRJP1-His were pooled for dialysis. The final formulation for FLAG-MRJP1-His was 200 mM NaCl in 30 mM HEPES pH 7.0.
For production of supernatants containing recombinant NHLRC3, suspension CHO cells were seeded at 350,000 cells/mL into 90% CD OptiCHO Medium (Thermo Scientific 12681011) containing 6 mM L-glutamine (Thermo Scientific 25030–081) and 10% CHO CD EfficientFeed (Thermo Scientific A1023401). CHO cells were grown at 37 °C for 5 days with constant shaking. The cell suspension was centrifuged at 10,000 × g for 40 min and the supernatant was passed through a 0.22 μm filter. The filtered supernatant was concentrated 35-fold and used directly in cell culture assays. Presence of NHLRC3 in the concentrate was verified by western blot.
RNA extraction and quantitative PCR
Total RNA was isolated using TRIzol® (Life Technologies) and RNeasy Kit (QIAGEN) according to the manufacturer’s protocol. cDNA was made with Superscript VILO (Life Technologies). All primers (Supplementary Table 2) were tested for efficiency and single products confirmed. qPCR analyses were performed on the Light Cycler 480II (Roche).
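The text does not state how primer efficiency was measured. One common approach, assumed here purely for illustration, is a standard curve: Ct values are measured across a serial dilution of template, the slope of Ct against log10(input) is fitted, and efficiency is taken as E = 10^(−1/slope) − 1, so that a slope near −3.32 corresponds to roughly 100% efficiency. The function names and the dilution series below are hypothetical.

```python
import math


def fit_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def amplification_efficiency(relative_inputs, ct_values):
    """Efficiency from a standard curve: E = 10**(-1/slope) - 1.

    relative_inputs are template amounts relative to the undiluted
    sample (for example 1, 0.1, 0.01); ct_values are the measured Cts.
    """
    log_inputs = [math.log10(amount) for amount in relative_inputs]
    slope = fit_slope(log_inputs, ct_values)
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical tenfold dilution series with illustrative Ct values.
inputs = [1.0, 0.1, 0.01, 0.001]
cts = [20.00, 23.32, 26.64, 29.96]
print(f"efficiency: {amplification_efficiency(inputs, cts):.1%}")
```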
Lentiviral expression and viral production
pLKO vectors were a gift of Alejandro Sweet-Cordero. N103 vector was a gift of Howard Y. Chang. Sequence-verified constructs were used to produce lentivirus using pRSV (Addgene plasmid #12253), pMD2.G (Addgene plasmid #12259), and pMDLg/pRRE (Addgene plasmid #12251). 293T cells were maintained in G418 (Sigma-Aldrich G8168). Plasmids were transfected using Lipofectamine 2000 following manufacturer’s protocol (Thermo-Fisher Scientific 11668). After 12 h, media was changed to viral production media (DMEM, 10% FBS, 1% P/S, 20 mM HEPES). After 48 h, media was collected and spun to remove cell debris, and the lentivirus-containing supernatant was added to Lenti-X™ Concentrator (Clontech). Following incubation at 4 °C for >4 h, the mixture was spun at 500 × g for 45 min, and the pellet was resuspended in mESC media and frozen at −80 °C.
Lentiviral transduction of mESCs
mESCs were plated at a density of 30,000 cells/cm². After 12 h, virus was added with 6 µg/mL polybrene. Media was changed 12 h later and puromycin selection began 48 h post-transduction.
Cell culture for teratoma formation assay
J1 mESCs were cultured in serum-free media (as described above) with addition of 1000 U/mL mouse LIF, 1 µM PD0325901, and 3 µM CHIR99021, or 0.5 mg/mL Royalactin for three passages. Cells were grown in suspension on Corning Costar Ultra-Low attachment plates (Sigma-Aldrich). Wells were seeded in duplicate at a cell density of 100,000/well, and media and protein were changed daily. To split, colonies were first pelleted by centrifugation at 1500 × g, suspended in TrypLE (Life Technologies), incubated at 37 °C for 5 min, diluted with PBS (Life Technologies) and pelleted. Cells were counted and re-seeded at a density of 100,000/mL.
Teratoma generation and histopathology
All animal studies were conducted in accordance with Stanford University animal use (IACUC) guidelines and were approved by the Administrative Panel on Laboratory Animal Care (APLAC). J1 mESCs were mixed with Matrigel (BD 356237) prior to being subcutaneously injected into 8-week-old female SCID/Beige mice (Charles River) on each flank. Four weeks after injection, the mice were euthanized and the teratomas were harvested. For histological analysis, slides were stained with hematoxylin and eosin (H&E) following manufacturer’s instructions. Analyses were performed by a board-certified veterinary pathologist.
CGR8.8 mESCs were grown in serum-free media (as described above) with addition of 1000 U/mL mouse LIF, 1 µM PD0325901, and 3 µM CHIR99021 (2i + LIF), 0i + Royalactin, or 0i + NHLRC3 (as noted) for ten passages. Media was changed daily. Cells were passaged using Accutase (Stemcell Technologies 07920) and suspended in M2 media for injection.
Protein extraction and western blot analysis
Cellular extracts were prepared using lysis buffer containing 50 mM Tris-HCl (pH 7.5), 250 mM NaCl, 1% NP-40, 0.5% Na-deoxycholate, 0.1% SDS, 1 mM phenylmethylsulfonyl fluoride, and Halt™ Protease and Phosphatase Inhibitor Cocktail (ThermoFisher Scientific). Extracts were run on a 4–12% Bis-Tris gel (Novex) and transferred onto PVDF membranes. Blots were blocked in 5% milk PBS-T (TBS for phospho-specific antibodies) for 1 h at room temperature followed by overnight incubation at 4 °C with primary antibodies. HRP-conjugated secondary antibodies were used at 1:10,000. Antibodies used in this study include Nanog (ReproCELL, RCAB001P, 1:1000), Klf2 (Millipore, 09–280, 1:1000), Esrrb (Perseus Proteomics, PP-H6705–00, 1:500), Stat3 (Cell Signaling, 124H6, 1:1000), pStat3 Y705 (Cell Signaling, D3A7, 1:1000), Sox2 (Santa Cruz, sc-17320, 1:250), Tfcp2l1 (R&D, AF5726, 1:250), NHLRC3 (OriGene, TA336106, 1:500), anti-HA (Cell Signaling, C29F4, 1:1000), and Actin-HRP (Santa Cruz, sc-1616, 1:2500). Uncropped scans of the most important blots are included in Supplementary Figure 5.
RNA-seq library construction
RNA was extracted using TRIzol and purified on column with the RNeasy Mini Kit (Qiagen). Ribosomal RNA was depleted with the Ribo-Zero Gold rRNA Removal Kit (Illumina). RNA was lyophilized, suspended in 10 μL of water, fragmented to an average size of 200 base pairs using the Ambion RNA Fragmentation Kit (AM8740), and purified using Zymo Clean and Concentrator-5 columns. 3′ Phosphorylation, adapter ligation, reverse transcription, immunoprecipitation, circularization, amplification, and PAGE separation were performed using the FAST-iCLIP library construction method as previously described28. The quality of the libraries, including size distribution and molarity, was assessed on a BioAnalyzer High Sensitivity DNA chip (Agilent). The libraries were then multiplexed and sent for sequencing on an Illumina NextSeq 500 High Output machine for 1 × 75 cycles. Sequencing data were deposited under GEO GSE81799.
ATAC-seq was performed on 50,000 J1 mESCs20. Cells were grown in serum mESC media (as described above) in serum/+LIF, serum/–LIF, or serum/–LIF + Royalactin for ten passages. Cells were washed with PBS (Life Technologies), trypsinized with Trypsin-EDTA (0.25%), quenched with serum mESC media, and washed with PBS before nuclear isolation with NP-40. Nuclei were resuspended in a transposase reaction mix containing 25 µL 2× TD buffer, 2.5 µL Transposase (Illumina), and 22.5 µL of nuclease-free water with sequencing adapters. Final libraries were purified on column using the QIAquick PCR Purification Kit (Qiagen) per the manufacturer’s protocol as well as with Agencourt AMPure XP beads (Beckman Coulter) to remove any remaining free adapters. The quality of the libraries, including size distribution and molarity, was assessed on a BioAnalyzer High Sensitivity DNA chip. Libraries were then multiplexed and sent for deep sequencing on the Illumina HiSeq 2500 machine for 2 × 50 cycles. Sequencing data were deposited under GEO GSE81799.
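The per-sample transposase reaction (25 µL 2× TD buffer, 2.5 µL Transposase, 22.5 µL nuclease-free water; 50 µL total) can be scaled to a master mix when several samples are processed together. The sketch below shows that scaling only; the function name and the 10% pipetting overage are illustrative assumptions and are not part of the stated protocol.

```python
# Per-reaction volumes in microlitres, taken from the text (50 uL total).
PER_REACTION_UL = {
    "2x TD buffer": 25.0,
    "Transposase": 2.5,
    "Nuclease-free water": 22.5,
}


def master_mix(n_reactions: int, overage: float = 0.10) -> dict:
    """Scale the per-reaction volumes to n_reactions.

    overage is a hypothetical allowance for pipetting loss
    (0.10 means 10% extra); set it to 0 for exact volumes.
    """
    if n_reactions < 1:
        raise ValueError("need at least one reaction")
    if overage < 0:
        raise ValueError("overage cannot be negative")
    scale = n_reactions * (1.0 + overage)
    return {name: round(vol * scale, 2) for name, vol in PER_REACTION_UL.items()}


# Hypothetical example: three conditions processed in parallel.
for component, volume in master_mix(3).items():
    print(f"{component}: {volume} uL")
```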
RNA-seq data analysis
Reads were aligned to the mouse reference genome (build mm9) using TopHat. The default maximum of 2 mismatches was allowed for read alignment. Gene counts were calculated using the HTSeq-count utility29 and used as input for differential gene expression analysis with DESeq (version 1.20.0)30. Genes with a p-value below 0.05 and a fold change of at least 2 were selected for further analysis. Top differentially regulated genes were validated by quantitative reverse transcription polymerase chain reaction. Further network analysis of significantly differentially expressed genes was performed using NetworkAnalyst31. For RNA-seq analysis, GO terms were obtained using DAVID with its default parameters.
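The gene selection step can be illustrated with a short filter over a table of differential expression results. This is a sketch rather than the pipeline used in the study: it assumes the thresholds mean p < 0.05 and at least a twofold change in either direction, it uses the unadjusted p-value because the text says only "p-value", and the field names and gene labels are hypothetical.

```python
import math


def select_de_genes(results, p_cutoff=0.05, fold_cutoff=2.0):
    """Return gene names passing both the p-value and fold-change cutoffs.

    results is an iterable of dicts with the (hypothetical) keys
    'gene', 'pvalue', and 'log2_fold_change'. Rows with missing or
    NaN statistics are skipped.
    """
    min_abs_log2fc = math.log2(fold_cutoff)
    selected = []
    for row in results:
        p = row["pvalue"]
        lfc = row["log2_fold_change"]
        if p is None or lfc is None or math.isnan(p) or math.isnan(lfc):
            continue
        # Assumed reading of the thresholds: strict on p, inclusive on fold.
        if p < p_cutoff and abs(lfc) >= min_abs_log2fc:
            selected.append(row["gene"])
    return selected


# Illustrative rows only; these are not results from the study.
table = [
    {"gene": "GeneA", "pvalue": 0.001, "log2_fold_change": 2.5},
    {"gene": "GeneB", "pvalue": 0.20, "log2_fold_change": 3.0},
    {"gene": "GeneC", "pvalue": 0.01, "log2_fold_change": -1.2},
    {"gene": "GeneD", "pvalue": 0.01, "log2_fold_change": 0.4},
    {"gene": "GeneE", "pvalue": float("nan"), "log2_fold_change": 5.0},
]
print(select_de_genes(table))
```

Working on the log2 scale treats up- and down-regulation symmetrically, so a twofold decrease (log2 fold change of −1) passes the same cutoff as a twofold increase.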
PCA was performed using samples as indicated. The genes that contributed the maximum amount of variance (PC1) were selected and GO terms obtained using the GO Consortium. Samples from different libraries were normalized using the shifted log of normalized counts from DESeq. The ‘plotPCA’ function, which is a part of DESeq2, was used to construct the PCA plots.
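To make the PCA step concrete, the following sketch applies a shifted log transform, log2(count + 1), centers each gene, finds the first principal component by power iteration, and ranks genes by the magnitude of their PC1 loading. It is not a reimplementation of ‘plotPCA’; the pseudocount of 1, the function names, and the toy count matrix are assumptions made for illustration.

```python
import math


def shifted_log(counts, pseudocount=1.0):
    """Apply log2(count + pseudocount) to a samples-by-genes matrix."""
    return [[math.log2(c + pseudocount) for c in row] for row in counts]


def center_columns(matrix):
    """Subtract each gene's (column's) mean across samples."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[v - m for v, m in zip(row, means)] for row in matrix]


def first_component(centered, iterations=200):
    """Unit-length PC1 loading vector via power iteration on X^T X."""
    n_genes = len(centered[0])
    # Deterministic, non-uniform start; it must not be orthogonal to PC1.
    v = [float(i + 1) for i in range(n_genes)]
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(row, v)) for row in centered]
        new_v = [sum(row[j] * s for row, s in zip(centered, scores))
                 for j in range(n_genes)]
        norm = math.sqrt(sum(w * w for w in new_v))
        if norm == 0:
            raise ValueError("data have no variance along the start vector")
        v = [w / norm for w in new_v]
    return v


def top_pc1_genes(gene_names, counts, n_top=2):
    """Genes with the largest absolute PC1 loadings."""
    loadings = first_component(center_columns(shifted_log(counts)))
    ranked = sorted(zip(gene_names, loadings),
                    key=lambda pair: abs(pair[1]), reverse=True)
    return [name for name, _ in ranked[:n_top]]


# Toy normalized counts: rows are samples, columns are genes.
genes = ["GeneA", "GeneB", "GeneC"]
counts = [
    [10, 50, 100],
    [1000, 52, 200],
    [12, 49, 110],
    [900, 51, 190],
]
print(top_pc1_genes(genes, counts))
```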
ATAC-seq data analysis
Reads were aligned to the mouse reference genome (build mm9) using Bowtie. The ATAC-seq regions were divided into separate analyses: correlation with closest TSS, correlation with 5356 traditional enhancer regions present in the mm9 genome, and correlation with 361 super-enhancer regions discovered for the mm9 genome28. The ATAC-seq signals for serum/+LIF, serum/–LIF, and serum/–LIF + Royalactin after ten passages were compared using DESeq, and the results are represented in the heatmaps. The heatmaps for TSS regions, traditional enhancers, super-enhancers, and differential peaks were produced using unsupervised clustering methods, which used the normalized signal values obtained by quantile normalization, to extract transitions between two states: upregulated and downregulated. The differential peaks between serum/–LIF + Royalactin and serum/–LIF were used for correlation with RNA-seq. The peaks were filtered on the basis of a p-value threshold of 0.05 as well as fold change. Boxplots were produced using the ‘boxplot’ function in R. p-values were calculated using Student’s t-test.
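Quantile normalization, which produced the signal values used for clustering, forces every sample to share the same distribution: each sample's values are ranked, the values at each rank are averaged across samples, and every value is replaced by the mean for its rank. The sketch below is a minimal version of that procedure, assuming ties are broken by input order rather than averaged; it is not the code used in the study, and the input values are illustrative.

```python
def quantile_normalize(columns):
    """Quantile-normalize a list of equal-length samples.

    columns holds one list of signal values per sample, with the same
    peaks in the same order. Ties are broken by input order, which is
    a simplification of implementations that average tied ranks.
    """
    if not columns:
        return []
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all samples must have the same number of peaks")

    # Mean of the values found at each rank across samples.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [sum(col[i] for col in sorted_cols) / len(columns)
                  for i in range(n)]

    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Illustrative signal values for three peaks in two samples.
sample_1 = [5.0, 2.0, 3.0]
sample_2 = [4.0, 1.0, 6.0]
print(quantile_normalize([sample_1, sample_2]))
```

After normalization each sample contains the same set of values, so differences between samples reflect changes in the rank of a peak rather than overall signal depth.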
GO term analysis for peaks that differed between serum/–LIF and serum/–LIF + Royalactin was performed using GREAT. The significant GO terms were filtered to include only GO terms associated with pluripotency and GO terms associated with metabolism. For pluripotency-related GO terms, biological processes including morphogenesis, development, proliferation, and stem cell processes were analyzed. For metabolic GO terms, biological processes related to metabolism and biosynthetic processes were chosen. Motif analysis for the differential peak lists was performed using HOMER with all differential peaks used as background.
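The text does not say how GO terms were assigned to the pluripotency and metabolism groups. One simple way to apply the categories it lists, assumed here for illustration, is keyword matching on the term name. The keyword stems, function names, and example term names below are hypothetical choices, and a term matching both groups is assigned to pluripotency first, which is an arbitrary tie-break.

```python
# Keyword stems derived from the categories named in the text.
PLURIPOTENCY_KEYWORDS = ("morphogenesis", "development", "proliferation", "stem cell")
METABOLISM_KEYWORDS = ("metaboli", "biosynthetic")


def classify_go_term(name):
    """Return 'pluripotency', 'metabolism', or None for a GO term name."""
    lowered = name.lower()
    if any(keyword in lowered for keyword in PLURIPOTENCY_KEYWORDS):
        return "pluripotency"
    if any(keyword in lowered for keyword in METABOLISM_KEYWORDS):
        return "metabolism"
    return None


def group_go_terms(names):
    """Split GO term names into the two groups, dropping all others."""
    groups = {"pluripotency": [], "metabolism": []}
    for name in names:
        label = classify_go_term(name)
        if label is not None:
            groups[label].append(name)
    return groups


# Illustrative term names; these are not results from the study.
terms = [
    "stem cell population maintenance",
    "lipid metabolic process",
    "immune response",
    "embryonic morphogenesis",
    "protein biosynthetic process",
]
print(group_go_terms(terms))
```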
Structural modeling and Royalactin analog identification
As implemented at the MPI Toolkit (http://toolkit.tuebingen.mpg.de), HHPRED enables sensitive searches of sequence and structural databases through the assembly of profile Hidden Markov Models (HMMs) from a seed sequence, with multiple iterations of HHblits (a more sensitive and faster program than PSI-BLAST) and PSIPRED (a very accurate secondary structure prediction program)24. The detection of a six-bladed β-propeller fold for Royalactin from the top salivary gland protein (PDB ID: 3Q6K) HHPRED hit was accompanied by a significant score of 177.4, an E-value of 6e−28, and a 28% amino acid identity from the structure-guided overlap of the mature, 413 residue honeybee Royalactin, with the 381 amino acid sand fly SGP. This structural alignment was used by MODELLER27 to guide a secure template-guided three-dimensional model of Royalactin (with a VERIFY3D score of 119.7)32, and also to nucleate a more sensitive search by HHPRED for a structurally analogous protein (to the greater Royalactin/SGP family) in the human and mouse proteomes. This latter screen, filtered by the signal peptide, single domain, and six-bladed β-propeller fold constraints common to Royalactin and SGPs, yielded NHLRC3 (Uniprot IDs: Q5JS37 and Q8CCH2, for the human and mouse orthologs, respectively) as the sole analog candidate. A three-dimensional model of the NHLRC3 β-propeller was then built by MODELLER on the best six-bladed β-propeller template recognized by HHPRED, Peptidyl-alpha-hydroxyglycine alpha-Amidating Lyase (PDB ID: 3FVZ; at a significant score of 125.6, E-value of 1.7e−18, and amino acid identity of 24%). The comparison and visualization of Royalactin and NHLRC3 structural models were in turn performed by PyMOL (www.pymol.org). We recognize that β-propeller folds in general (irrespective of ‘blade’ number) are consistently used as interaction scaffolds and preferred binding platforms in the cell33. The structural models are available upon request.