
Chromosome-level Haploid Assembly of Cannabis sativa L. cv. Pink Pepper


December 28, 2024


Abstract

As molecular research on hemp (Cannabis sativa L.) continues to advance, there is a growing need for more diverse genome data and more accurate genome assemblies. In this study, we report a three-way assembly of a cannabidiol (CBD)-rich cannabis cultivar, ‘Pink Pepper’, built from three sequencing technologies: PacBio Single Molecule Real-Time (SMRT) sequencing, Illumina sequencing, and Oxford Nanopore Technologies (ONT) sequencing. This assembly anchors scaffolds to the ten chromosomes of hemp, and to avoid confusion with previous cannabis genetic research, the chromosomes have been labeled based on an earlier reference genome. The total assembled genome length is 770 Mbp, with a GC content of 34.09% and repeat regions accounting for 77.13% of the genome. This assembly, which incorporates the unique strengths of the three sequencing technologies, demonstrated the highest complete BUSCO scores (97.8%-99.6%) among the reported cannabis genomes, as evaluated using three different BUSCO databases. With annotations for 30,459 protein-coding genes, this dataset can serve as a valuable resource for advancing genetic research on hemp.

Background & Summary

Cannabis sativa L. is a primarily annual, dioecious or monoecious herb that has traditionally been cultivated for fiber production, with a history dating back to around 8,000 BC1,2. Although many fiber-use cannabis plants are still cultivated today, interest has recently increased in the unique chemical components of cannabis called cannabinoids, and related research and medicinal applications have been growing3,4,5,6,7,8.

Generally, Δ9-tetrahydrocannabinol (Δ9-THC) and cannabidiol (CBD) are the most well-known among over 100 cannabinoids, as they are the most abundant9,10. These two components mainly exist in the plant as Δ9-tetrahydrocannabinolic acid (Δ9-THCA) and cannabidiolic acid (CBDA), which are converted to Δ9-THC and CBD by decarboxylation, a chemical reaction in which the carboxyl group is removed upon exposure to heat or light11.

These two main cannabinoids are used for different purposes. Δ9-THC, which is permitted in 25 states of the USA and in a few countries such as Canada, the UK, Croatia, and the Czech Republic, is often used for recreational purposes due to its psychoactive properties12,13. However, medical research into its potential uses is ongoing. On the other hand, CBD is reported to be effective for medical purposes, with anti-anxiety14, antioxidant and anti-inflammatory15, and anticonvulsant16 effects, as well as synergistic effects with anti-cancer drugs17. In the medical cannabis cultivation industry, there have been active breeding efforts for several years to reduce Δ9-THC levels and increase CBD levels18. Researchers continue to seek a better understanding of the biological and physiological characteristics of medicinal (Type III) cannabis to further advance its breeding19.

The initial draft genome of the marijuana strain ‘Purple Kush’20 was completed in 2011, and since then efforts have been made to establish a comprehensive database and obtain high-quality genome data for various strains (Table 1)21,22,23,24. The current cannabis assemblies lack consistency in total assembly size, and the naming of chromosome numbers and orientations is not standardized25. Previously published chromosome-level cannabis assemblies contain at least 147 scaffolds, indicating a need for better continuity (Table 1). Additionally, the average number of N’s per 100 kbp is 2,772, reflecting a very high proportion of unknown sequence. Kovalchuk et al. (2020) pointed out that the Cannabis genome assembly is incomplete, contains gaps, is poorly aligned with low resolution, and has a consensus sequence quality that undermines the accuracy of annotations26. Furthermore, such assemblies make it difficult for data users to distinguish real genome differences from assembly errors.

Table 1 The list of assemblies of Cannabis sativa L. currently available in NCBI GenBank.


With the increasing use of Cannabis for both agricultural and medicinal purposes, it has become essential to establish a comprehensive and high-resolution cannabis genomic database. This resource is crucial for comparative genomics, evolutionary studies, breeding improvements, and understanding the genetic regulation of key agronomic traits, such as cannabinoid production. Recently, a growing number of studies have used genomic data to examine small-scale variations, such as single nucleotide polymorphisms (SNPs) in specific genes, as well as mid- to large-scale variations, such as long terminal repeats (LTRs). The accurate identification of such variations relies on the quality of sequencing and genome assembly. Therefore, ensuring high-quality genomic data is critical for the reliable interpretation of genetic variation.

To achieve a high-precision cannabis genome assembly, we combined three sequencing technologies in a hybrid assembly: PacBio Single Molecule, Real-Time (SMRT) sequencing, Oxford Nanopore Technologies (ONT) sequencing, and Illumina high-throughput short-read sequencing. We generated two types of third-generation primary reads of ‘Pink Pepper’: PacBio SMRT reads, well established for their high accuracy27, and ONT reads, which are advantageous for their longer read lengths28. The accuracy of the genome assembly was then increased by aligning it with Illumina sequencing data of the same variety, resulting in a chromosome-level genome (Fig. 1a). The assembled genome was classified into 10 chromosomes, with a size of 770 Mbp. The GC content was 34.09%, N per 100 kbp was 0.69, and the complete Benchmarking Universal Single-Copy Orthologs (BUSCO) scores were 99.6% (viridiplantae_odb10), 97.8% (eudicots_odb10), and 98.6% (embryophyta_odb10). Overall, repeats accounted for 77.13% of the entire genome. Based on transcriptome data from leaves, flowers, roots, and stems, and protein sets related to cannabis, 30,459 genes encoding proteins were predicted, accounting for 92.92% of the total 32,779 genes.

Fig. 1

Schematic diagram of the genome assembly of Cannabis sativa L. conducted in this study (a). The reference genome used for scaffolding was GCA_900626175.2 from the NCBI GenBank database. Distribution of the k-mer analysis using GenomeScope 2.0 (k-mer: 19), with maximum k-mer coverage at 300× (b) and 1,000,000× (c). The blue portion in the figure represents the analyzed k-mer frequency, while the orange and yellow lines represent errors and unique sequences, respectively (b, c).


Here, we present the complete genome sequence of the Pink Pepper cultivar, selectively bred for high CBD production. This assembled genome provides more precise fundamental information not only for cannabis breeding but also for studies of biological characteristics and plant responses through the analysis of differentially expressed genes. Consequently, understanding the cannabinoid and terpene biosynthesis mechanisms in cannabis could ultimately contribute to the development and application of medical cannabis.

Methods

Cannabis variety and cultivation

The variety of C. sativa used in this study was ‘Pink Pepper’, which is a Type III cannabis strain with a high content of CBD (open field, 11.404 ± 1.117% of inflorescence dry weight, 3.267 ± 0.335% of leaf dry weight). The plant material was a cut clone, and rooting was induced in tap water before cultivation. To secure rooting space, a large pot (15 L) was filled with bed soil (bio bed soil, Heungnong Jongmyo Co., Pyeongtaek, Korea) for cultivation. The plants were grown in a greenhouse for 90 days (24 ± 4 °C), and the photoperiod was adjusted to 18 hours/day using shading curtains. Although the strain was auto-flowering, the light was adjusted to 12 hours/day to activate flower differentiation and induce flower development. Throughout the entire growth cycle, the plants were irrigated with 400 mL of tap water once daily.

Nucleic acid extraction

High molecular weight genomic DNA was extracted from fresh leaf tissue during the vegetative growth phase, using the cetrimonium bromide (CTAB)-based extraction method. Total RNA was extracted from four types of plant tissue (flower, leaf, root, and stem) during the flowering stage, using the Quick-RNA MiniPrep kit (Zymo Research, Irvine, CA, USA). To preserve the integrity of the nucleic acids, the sampled plant tissues were immediately submerged in liquid nitrogen and subsequently stored at −80 °C in a deep freezer (DAIHAN Scientific Co., Ltd., Wonju, Korea) until further analysis.

Quality control and library preparation

DNA concentration, quality, and integrity were assessed using Victor 3 fluorometry (PerkinElmer Inc., Waltham, MA, USA) and gel electrophoresis. A DNA integrity number (DIN) of seven or higher was confirmed. Quality control and normalization of the Illumina library involved quantification according to the Illumina qPCR quantification protocol guide. For nanopore sequencing, the library was prepared using a ligation sequencing kit and quantified using Qubit 3.0 (Thermo Fisher Scientific Inc., Waltham, MA, USA). The PacBio library was prepared using the SMRTbell Express Template Prep Kit 2.0 (Pacific Biosciences of California Inc., Menlo Park, CA, USA).

RNA was quantified with the Agilent Technologies 2100 Bioanalyzer (Santa Clara, CA, USA), achieving an RNA integrity number (RIN) of seven or higher, indicative of high quality. RNA integrity was further verified by gel electrophoresis. mRNA purification was conducted using the TruSeq stranded mRNA kit (Illumina, San Diego, CA, USA), followed by cDNA reverse transcription for library preparation. Illumina paired-end sequencing was subsequently performed.

Sequencing and pre-processing

Using the Illumina NovaSeq 6000 (San Diego, CA, USA), we generated paired-end read data comprising 815,329,552 reads and totaling 123 giga base pairs (Gbp). To remove contaminants and adaptors, fastp v0.21.0 (https://github.com/OpenGene/fastp) and BBDuk v38.87 (k = 31, mcf = 0.5; https://sourceforge.net/projects/bbmap/) were used. The contaminant databases included viral, rRNA, human, and bacterial sequences. After quality and adaptor trimming, 94 Gbp of read data was obtained, with 0.06% viral, 2.61% rRNA, 0.03% human, and 0.03% bacterial reads removed. For long-read sequencing, ONT sequencing was performed using the ONT GridION (Oxford, UK) and repeated five times for high reliability. Adaptors were removed from the generated long-read data using Porechop (v0.2.3, https://github.com/rrwick/Porechop). All retained reads were confirmed to pass a quality score of seven or higher. The number of reads was 11,484,123, with a total base pair count of 89 Gbp and an N50 of 26,677 bp. Using the PacBio Sequel II system (Menlo Park, CA, USA), single molecule, real-time (SMRT) sequencing was performed to generate polymerase reads. Using the SMRT Link v11.1 software with the PacBio Sequel II system, adaptors were removed and subreads were aligned, resulting in 2,039,056 reads with a total base pair count of 21 Gbp.
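As a rough check on sequencing depth, the yields reported above can be divided by the genome size. The following is a minimal sketch, assuming the 770 Mbp final assembly length as the denominator; the function and variable names are illustrative and are not part of the authors' pipeline.

```python
def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Return the average sequencing depth (x-fold) for a given yield."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Assumption: the final assembly length reported in this study is used as genome size.
GENOME_SIZE_BP = 770e6

# Yields in base pairs, as reported in the text.
yields_bp = {
    "Illumina (raw)": 123e9,
    "Illumina (trimmed)": 94e9,
    "ONT": 89e9,
    "PacBio": 21e9,
}

coverage = {name: fold_coverage(bp, GENOME_SIZE_BP) for name, bp in yields_bp.items()}

for name, depth in coverage.items():
    print(f"{name}: ~{depth:.0f}x")
```

Under this assumption, each long-read technology on its own covers the genome many times over, which is the precondition for combining them in a hybrid assembly.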

De novo assembly and scaffolding

To obtain basic genomic statistics, Jellyfish29 v2.2.10 and GenomeScope30 2.0 were used to predict the genome size from the Illumina sequence reads. The analysis was conducted with k-mer sizes of 17, 19, and 21, with the 19-mer used for the final genome size estimation. As a result, homozygosity ranged from 98.64% to 98.69%, while heterozygosity ranged from 1.31% to 1.36%. The estimated haploid genome length ranged from 776 Mbp (mega base pairs) to 779 Mbp, while the repeat length ranged from 554 Mbp to 556 Mbp. The unique length was estimated at 222 Mbp to 223 Mbp (Fig. 1b and c).
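The idea behind k-mer based genome size estimation can be illustrated with a deliberately naive sketch: count k-mers, build a depth histogram, and divide the total number of k-mers by the depth of the main peak. This is not the model that GenomeScope fits (which also accounts for heterozygosity, repeats, and sequencing errors); the function names and the `min_depth` error cutoff are assumptions made for illustration.

```python
from collections import Counter


def kmer_histogram(reads, k):
    """Count k-mers across reads, then tally how many distinct k-mers occur at each depth."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return Counter(counts.values())  # depth -> number of distinct k-mers


def estimate_genome_size(histogram, min_depth=1):
    """Naive estimate: total k-mers at or above an error cutoff, divided by the peak depth."""
    usable = {depth: n for depth, n in histogram.items() if depth >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    peak_depth = max(usable, key=usable.get)
    total_kmers = sum(depth * n for depth, n in usable.items())
    return total_kmers / peak_depth


# Toy input: the same 5-base sequence read twice, so every 3-mer has depth 2.
toy_reads = ["ACGTA", "ACGTA"]
toy_hist = kmer_histogram(toy_reads, k=3)
toy_size = estimate_genome_size(toy_hist)
print(dict(toy_hist), toy_size)
```

On real data, low-depth k-mers are dominated by sequencing errors, which is why a cutoff such as `min_depth` is typically applied before locating the peak.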

NextDenovo v2.3.1 (https://github.com/Nextomics/NextDenovo) was used to assemble the ONT reads, and the PacBio reads were then mapped31, generating 130 contigs with a total length of 809 Mbp. The contigs were then polished using Illumina short reads, yielding contigs totaling 810 Mbp. Finally, the contigs were filtered using PurgeHaplotigs, which removed about 4.99% of the sequence as redundant, leaving 70 contigs (770 Mbp). Using the RagTag software (https://github.com/malonge/RagTag) with default parameters, the resulting contigs were mapped to the previous version of the C. sativa reference genome (GCA_900626175.2)32, and pseudomolecules totaling 770 Mbp were created. The longest chromosome was chromosome 2, with a length of 92 Mbp, while the shortest was chromosome 8, with a length of 51 Mbp; the N50 value was 77 Mbp (Table 2).
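For readers less familiar with the N50 statistic quoted above, the following minimal sketch shows how it is computed from a list of sequence lengths. The lengths in the example are made up for illustration and are not the chromosome lengths of this assembly.

```python
def n50(lengths):
    """Return the length L such that sequences of length >= L cover at least half of the total."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


# Hypothetical scaffold lengths in Mbp (illustrative only).
example_lengths = [80, 70, 50, 40, 30, 20]
print(n50(example_lengths))
```

Because the statistic is weighted by length, a few long scaffolds raise N50 far more than many short ones, which is why it is commonly used as a measure of assembly contiguity.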

Table 2 Genome statistics during assembly and scaffolding process.


Repeat annotation

In general, long-read sequencing techniques, such as PacBio and ONT sequencing, are advantageous for the accurate detection of repeats containing tandem repeats (TRs). These methods can assemble long repeats spanning genes relatively accurately and detect the length, nucleotide composition, and nucleotide variations of TRs33. De novo repeat families were identified using RepeatModeler software (https://github.com/Dfam-consortium/RepeatModeler), and the distribution of repeats within the assembled genomic sequences was analyzed using RepeatMasker v4.1.2 software34 (https://github.com/rmhubley/RepeatMasker). To enhance readability, the repeats were categorized into DNA elements, long interspersed nuclear elements (LINEs), LTR elements, rolling circle (RC) elements, and short interspersed nuclear elements (SINEs). Overall, repeats represented 77.13% of the cannabis assembly, which is consistent with previous research reporting high repeat levels in the cannabis cultivars ‘Purple Kush’ and ‘Finola’ (73.9% and 73.3%, respectively)23.

The results indicated a slightly higher repeat content in cannabis compared to its taxonomically close relative Humulus lupulus (71.46%)35,36. Additionally, it was on the higher side compared to other plants such as Xanthoceras sorbifolium (56.39%)37, Oryza sativa (51.63% – 54.34%)38, Panax ginseng (56.9%)39, and Nicotiana tabacum (67.05%)40. The most abundant repeat regions were LTR-Gypsy retrotransposons and LTR-Copia retrotransposons, comprising 24.45% and 25.81% of the genome, respectively (Table 3).

Table 3 Result of repeat annotation statistics.


Gene annotation

Total RNA from plant tissues, including stems, leaves, roots, and flowers, was reverse-transcribed, and paired-end sequencing was performed using Illumina NovaSeq 6000. Subsequently, de novo assembly was conducted to obtain transcriptome data41. In parallel, an evidence dataset was constructed using protein sequences from 10 species registered on NCBI (Table 4), and the first gene prediction was performed using MAKER (v3.01.03)42. Among the predicted genes, only those with an annotation edit distance of 0.25 or lower were selected. GeneMark (v4.38)43, SNAP (v20060728)44, and AUGUSTUS (v3.3.2)45 were used for ab initio gene prediction training.
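The annotation edit distance (AED) threshold described above can be sketched as a simple filter over GFF3 records. This sketch assumes that the score is stored as an `_AED` attribute on `mRNA` features, as in typical MAKER GFF3 output; the example records and IDs are hypothetical, and the authors' actual filtering procedure is not described in the text.

```python
def parse_attributes(column9: str) -> dict:
    """Split a GFF3 attribute column (key=value;key=value) into a dict."""
    pairs = (field.split("=", 1) for field in column9.strip().split(";") if "=" in field)
    return {key: value for key, value in pairs}


def filter_by_aed(gff_lines, max_aed=0.25):
    """Yield IDs of mRNA features whose _AED attribute is at or below max_aed."""
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments, directives, and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "mRNA":
            continue
        attrs = parse_attributes(cols[8])
        if "_AED" in attrs and float(attrs["_AED"]) <= max_aed:
            yield attrs.get("ID", "")


# Hypothetical records for illustration only.
example = [
    "##gff-version 3",
    "chr1\tmaker\tmRNA\t100\t900\t.\t+\t.\tID=gene1-mRNA-1;Parent=gene1;_AED=0.10;_eAED=0.10",
    "chr1\tmaker\tmRNA\t1500\t2400\t.\t-\t.\tID=gene2-mRNA-1;Parent=gene2;_AED=0.40;_eAED=0.38",
    "chr1\tmaker\tmRNA\t3000\t3900\t.\t+\t.\tID=gene3-mRNA-1;Parent=gene3;_AED=0.25;_eAED=0.25",
]
kept = list(filter_by_aed(example))
print(kept)
```

An AED of 0 indicates perfect agreement between a gene model and its supporting evidence, so a cutoff of 0.25 keeps only well-supported models for training the ab initio predictors.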

Table 4 Used protein database of related species for evidence dataset.


By integrating the results of the first gene prediction and the ab initio training dataset, a second round of gene model prediction was conducted. EvidenceModeler46 v1.1.1 was used to apply different weights to each dataset: 7 for GeneMark data and 10 for the others.

To predict the function of the identified genes, DIAMOND (v5.34-73.0; maximum target sequence = 20, e-value threshold = 1e-5)47 was used to analyze similarity with the NCBI non-redundant protein database48 and the Arabidopsis thaliana Araport11 annotation49. Gene ontology (GO) analysis was conducted using BLAST2GO (v5.2.5)50, protein domains were identified using InterProScan (v5.34-73.0)51, and KEGG (Kyoto encyclopedia of genes and genomes) pathway analysis was performed using the KAAS web-tool52. The numbers of annotated genes were as follows: 30,395 (92.73%) for NCBI nr, 22,093 (67.40%) for Araport11, 21,878 (66.74%) for InterProScan, 16,464 (50.23%) for BLAST2GO, and 10,376 (31.65%) for KAAS web-tool. The data from each source were combined and complemented, resulting in 30,459 annotated genes, which accounted for 92.92% of the 32,779 total genes (Fig. 2 and Table 5).
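The percentages above follow from dividing each count by the 32,779 total genes, and the combined figure corresponds to the union of the per-tool gene sets. The sketch below recomputes the reported percentages and shows the union step on a toy example; the gene IDs are hypothetical, and the real per-tool gene lists are not reproduced here.

```python
TOTAL_GENES = 32_779  # total predicted genes reported in the text

# Annotated gene counts per source, as reported in the text.
reported_counts = {
    "NCBI nr": 30_395,
    "Araport11": 22_093,
    "InterProScan": 21_878,
    "BLAST2GO": 16_464,
    "KAAS": 10_376,
}


def percent(count, total=TOTAL_GENES):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


percentages = {source: percent(n) for source, n in reported_counts.items()}
combined_percent = percent(30_459)

# Toy illustration of combining sources: a gene counts as annotated
# if at least one tool produced a hit for it (hypothetical IDs).
toy_hits = {
    "NCBI nr": {"g1", "g2", "g3"},
    "Araport11": {"g2", "g4"},
    "KAAS": {"g3"},
}
toy_annotated = set().union(*toy_hits.values())
print(percentages, combined_percent, sorted(toy_annotated))
```

Because the union is only slightly larger than the NCBI nr set alone (30,459 versus 30,395), nearly all genes annotated by the other tools also had an nr hit.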

Fig. 2

Number and percentage of annotations by different annotation methods. The data represented in the Venn diagram describe protein IDs that are shared among functional annotation tools: Araport11, annotation database of Arabidopsis thaliana; NCBInr, NCBI protein sequence database; InterProScan, InterPro protein sequence database; KAAS, Kyoto encyclopedia of genes and genomes (KEGG) protein sequence database; Blast2GO, tool for Gene Ontology (GO) analysis and functional annotation.


Table 5 Functional annotation statistics of software for gene prediction.


Data Records

The raw data sets generated in this study are available in the NCBI SRA database53. Specifically, the PacBio sequencing data54 for the genome are deposited under accession number SRX17887361, the ONT sequencing data55 under accession number SRX17887360, and the Illumina data56 under accession number SRX17887355. The raw mRNA data generated for genome annotation have also been registered in the NCBI SRA database, with the following accession numbers: SRX17887359 (stem)57, SRX17887358 (root)58, SRX17887357 (leaf)59, and SRX17887356 (flower)60.

The assembled genome can be accessed in the GenBank database61. Comprehensive gene annotation information, including gene structure, functional predictions, and transcriptome and protein data sets, can be accessed in the Figshare database62.

Technical Validation

Plant sample validation

The DNA concentration of the leaf sample was 23.616 ng/µl, and 100 µl was extracted (total DNA amount: 3.262 µg). The DIN value was determined to be 7.5, and after passing the quality check, it was used for library preparation. The RNA concentration was 107.024 ng/µl, and 96 µl was extracted (total RNA amount: 10.274 µg). The RIN value was confirmed to be 8.4, and the rRNA ratio was determined to be 2.0.

The RNA concentration of the root sample was 41.937 ng/µl, and 50 µl was extracted (total RNA amount: 2.097 µg). The RIN value was confirmed to be 7.7, and the rRNA ratio was determined to be 4.2.

The RNA concentration of the stem sample was 59.53 ng/µl, and 50 µl was extracted (total RNA amount: 0.281 µg). The RIN value was confirmed to be 7.7, and the rRNA ratio was determined to be 8.3.

The RNA concentration of the inflorescence (flower) sample was 952.552 ng/µl, and 50 µl was extracted (total RNA amount: 47.628 µg). The RIN value was confirmed to be 8.3, and the rRNA ratio was determined to be 2.7.
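Total nucleic acid amount is normally the product of concentration and volume. The sketch below recomputes that product from the printed values and compares it with the printed totals; the tolerance and names are assumptions for illustration. For the leaf RNA, root RNA, and flower RNA samples the product matches the printed total, whereas for the leaf DNA and stem RNA samples it does not, which the sketch reports rather than corrects.

```python
# name: (concentration in ng/µl, volume in µl, printed total in µg), as given in the text
samples = {
    "leaf DNA": (23.616, 100, 3.262),
    "leaf RNA": (107.024, 96, 10.274),
    "root RNA": (41.937, 50, 2.097),
    "stem RNA": (59.53, 50, 0.281),
    "flower RNA": (952.552, 50, 47.628),
}


def total_micrograms(concentration_ng_per_ul: float, volume_ul: float) -> float:
    """Convert concentration x volume (ng) to micrograms."""
    return concentration_ng_per_ul * volume_ul / 1000


TOLERANCE_UG = 0.005  # assumed rounding tolerance

computed = {name: total_micrograms(c, v) for name, (c, v, _) in samples.items()}
mismatches = [
    name
    for name, (_, _, printed) in samples.items()
    if abs(computed[name] - printed) > TOLERANCE_UG
]

for name, value in computed.items():
    print(f"{name}: computed {value:.3f} µg, printed {samples[name][2]} µg")
print("differs from printed total:", mismatches)
```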

Comparison of read statistics and BUSCO with existing cannabis assemblies

Raw reads for the chromosome-level assemblies publicly available on NCBI were collected using SRA Toolkit (v3.1.1-ubuntu), excluding the reads for Abacus, which are not present in the Sequence Read Archive (SRA). Statistics were then generated using SeqKit63 (v2.8.2, Supplementary Table 1).

Among them, the Illumina NovaSeq 6000 used for this assembly produced the highest number of reads, generating 815,329,552 paired-end reads totaling 123 Gbp. This is 2.7 times more reads than JL’s HiSeq X Ten run (SRA accession: SRX6757267), which previously held the highest number of reads, with comparable read lengths. The reads produced by ONT GridION had an N50 value of 26,677 and an N60 value of 73,606; this N50 is 1.8 times higher than the N50 value of 14,716 for cs10’s ERX3863365 reads, the only other reads produced using ONT. It is also 1.7 times higher than the N50 value of 16,037 for Cannbio-2’s PacBio Sequel reads. Longer reads can overlap over greater distances, potentially leading to a more contiguous assembly. The reads produced by PacBio Sequel II were evaluated with a Q20 of 98.88% and a Q30 of 97.42%, the highest values next to those of Purple Kush (SRA accession: SRX4178554). These statistics demonstrate the impact of rapidly advancing sequencing technologies on producing high-quality reads. Furthermore, they emphasize the importance of hybrid assembly in offsetting disadvantages and leveraging advantages for downstream analysis.

To compare the completed chromosome-level assembly (Fig. 3) with other assemblies, the final assembly versions of the chromosome-level Cannabis genomes registered in NCBI were collected20,21,24,25 (GenBank accession: GCA_025232715.1, GCA_013030365.1, GCA_003417725.2, GCA_016165845.1, GCA_000230575.5, GCA_900626175.2), and the collected genome data were validated for integrity using vdb-validate. The BUSCO (v5.2.2) analysis of NCBI’s chromosome-level assemblies was conducted using the viridiplantae_odb10, eudicots_odb10, and embryophyta_odb10 databases (released Jan 08, 2024). Among the registered chromosome-level assemblies, this assembly showed the highest percentage of complete BUSCOs for all three databases (Fig. 4a-c). Specifically, for the viridiplantae_odb10 database, the complete percentage was 99.6% (single-copy: 95.8%, duplicated: 3.8%), for the eudicots_odb10 database it was 97.8% (single-copy: 91.6%, duplicated: 6.2%), and for the embryophyta_odb10 database it was 98.6% (single-copy: 92.7%, duplicated: 5.9%). Our assembly also showed a high percentage of single-copy BUSCOs (Fig. 4a-c). The treemap, which represents the relative size of the assemblies, highlights the improved continuity of our assembly. Specifically, the number of scaffolds in the chromosome-level assemblies is 5,303 for Finola, 147 for Cannbio-2, 12,836 for Purple Kush, 220 for cs10 (CBDRx), 160 for Abacus, and 483 for JL, while this assembly contains only 17 scaffolds, confirming its superior continuity (Table 1 and Fig. 4d).
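In BUSCO reporting, the complete percentage (C) is the sum of the single-copy (S) and duplicated (D) percentages. The following minimal sketch checks that relationship for the values reported above and derives the share of complete BUSCOs that are duplicated; the helper names and the rounding tolerance are assumptions for illustration.

```python
# Reported BUSCO percentages: complete (C), single-copy (S), duplicated (D).
busco_scores = {
    "viridiplantae_odb10": {"C": 99.6, "S": 95.8, "D": 3.8},
    "eudicots_odb10": {"C": 97.8, "S": 91.6, "D": 6.2},
    "embryophyta_odb10": {"C": 98.6, "S": 92.7, "D": 5.9},
}


def is_consistent(score, tolerance=0.05):
    """Check that complete equals single-copy plus duplicated, within rounding."""
    return abs(score["S"] + score["D"] - score["C"]) <= tolerance


def duplication_share(score):
    """Percentage of complete BUSCOs that are present in more than one copy."""
    return 100 * score["D"] / score["C"]


checks = {name: is_consistent(score) for name, score in busco_scores.items()}

for name, score in busco_scores.items():
    print(f"{name}: C=S+D holds: {checks[name]}, duplicated share: {duplication_share(score):.1f}%")
```

A low duplicated share is generally desirable in a haploid assembly, since duplicated BUSCOs can indicate retained haplotigs.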

Fig. 3

Circle plot of the Cannabis sativa L. cv Pink Pepper genome assembly. From the outermost to innermost layers: Chromosome number, gene, CDS (coding sequence) frequency, mRNA frequency, and the relationship of the main cannabinoid gene. The protruding segments on the chromosomes represent unscaffolded regions. The scale indicating chromosome size is in units of Mbp (mega base pairs). CDS frequency and mRNA frequency are visualized after trimming at the 1 Mbp level. The red links connecting the center represent annotated genes involved in Δ9-THCA synthesis, while the blue lines represent annotated genes involved in CBDA synthesis (based on the description).


Fig. 4

Assembly completeness evaluation using Benchmarking Universal Single-Copy Orthologs (BUSCO) and comparison of assembly continuity using a tree map chart. The evaluations were conducted using viridiplantae_odb10 (a), eudicots_odb10 (b), and embryophyta_odb10 (c). C: complete BUSCOs (S + D), S: Single-copy, D: Duplicated, F: Fragmented, M: Missing. The tree map chart visualizes the continuity of the assembly (d). The GenBank accession numbers for the varieties are as follows: Pink Pepper, the assembly data from this study (GCA_029168945.1); Abacus, GCA_025232715.1; Cannbio-2, GCA_016165845.1; JL, GCA_013030365.1; cs10, GCA_900626175.2; Finola, GCA_003417725.2; Purple Kush, GCA_000230575.5.


Synteny analysis with close genetic relatives of C. sativa

Synteny comparison was conducted with BLASTp (v2.12.0)64 and MCScanX65, using the protein sequences (protein.fasta) and annotation files (annotation.gff) generated by the annotation. Previous studies using C. sativa genomes reported synteny comparison results with Ziziphus jujuba, which belongs to the same order, Rosales24. In our synteny analysis between the Pink Pepper genome assembly and the Z. jujuba reference genome (RefSeq: GCF_031755915.1), a total of 72,921 genes were identified, with 30,456 classified as collinear. This indicates that C. sativa and Z. jujuba share 41.77% synteny (Fig. 5a and b).

Fig. 5

Synteny analysis between the assembled Pink Pepper genome and the reference genomes of closely related species. The multicolored connecting curves between the chromosomes of the two species represent syntenic blocks, indicating conserved gene blocks between the genomes (a). The dot plots generated from the synteny data show the conserved synteny between Cannabis sativa L. and other genomes (b, c). cs1-10: Chromosome number of C. sativa formed by this assembly. zj1-12: chromosome number of the Ziziphus jujuba reference genome (RefSeq: GCF_031755915.1). hl1-10: chromosome number of the Humulus lupulus reference genome (RefSeq: GCF_963169125.1).


We further conducted a synteny analysis using the reference genome of H. lupulus (RefSeq: GCF_963169125.1), which belongs to the Cannabaceae family, a more specific clade within Rosales, and shares significant genetic similarity with C. sativa. Out of the 79,354 identified genes, 55,832 were classified as collinear, revealing a high synteny of 70.36% (Fig. 5a and c). These results further confirm the close genetic relationship between C. sativa and H. lupulus. The synteny analysis data are available on Figshare for further analysis and use66.
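The synteny percentages reported for both comparisons are the ratio of collinear genes to all genes considered in each pairwise analysis. A minimal sketch recomputing them from the counts in the text follows; the function name is illustrative.

```python
def synteny_percent(collinear_genes: int, total_genes: int) -> float:
    """Return the percentage of genes that fall in collinear blocks."""
    if total_genes <= 0:
        raise ValueError("total_genes must be positive")
    return 100 * collinear_genes / total_genes


# (collinear genes, total genes) for each comparison against Pink Pepper, from the text.
comparisons = {
    "Z. jujuba": (30_456, 72_921),
    "H. lupulus": (55_832, 79_354),
}

synteny = {
    species: round(synteny_percent(collinear, total), 2)
    for species, (collinear, total) in comparisons.items()
}
print(synteny)
```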

Structural comparison between cannabis genomes

To compare genomic structure, we aligned the Pink Pepper assembly with the previous reference genome, cs10 (GCA_900626175.2). Whole genome alignment (WGA) was performed using D-GENIES (v1.5.0)67 with Minimap2 (v2.26; -f = 0.02)68 as the aligner. The dot plot, with Pink Pepper as the target (reference) and cs10 as the query, revealed significant structural variations, such as gaps, inversions, and repeats, across the chromosomes, despite both genomes being from the same species. The comparison showed 19.89% no match, 9.12% matching <25%, 57.40% matching <50%, 13.36% matching <75%, and only 0.23% matching >75% (Fig. 6a). Additionally, distinct structural differences and variations were identified on Chromosome 7, which contains a high density of CBDAS and THCAS (or pseudo- and fragmented) loci21,62 in both our assembly and cs10 (Figs. 3 and 6b).

Fig. 6

Whole genome alignment (WGA) dot plot between the assembled Pink Pepper genome and cs10. The dots generated in the plot represent regions of similarity between the two genomes that have been aligned. p01-p10: Chromosome numbers of Pink Pepper, c01-c10: Chromosome numbers of cs10. The WGA excluded unscaffolded contigs (a), and the dot plot of chromosome 7, which contains loci related to cannabidiolic acid synthase (CBDAS) and Δ9-tetrahydrocannabinolic acid synthase (THCAS), shows significant structural differences despite both strains being high-CBD varieties (b). Structural variations (SVs) at the chromosome level include breakpoints, duplications, sequence differences, gaps, and jumps, and the variant count was calculated per 10 Mbp (c). The GenBank accession numbers for the varieties are as follows: Pink Pepper, the assembly data from this study (GCA_029168945.1); Abacus, GCA_025232715.1; Cannbio-2, GCA_016165845.1; JL, GCA_013030365.1; cs10, GCA_900626175.2; Finola, GCA_003417725.2; Purple Kush, GCA_000230575.5.


Figure 6c presents the distribution of structural variations (SVs), categorized by chromosome and interval, using the current assembly as a reference against previously registered genomic datasets (Abacus, Cannbio-2, cs10, Finola, JL, and Purple Kush). The analysis was conducted using NUCmer (v3.1; l = 40, g = 90, b = 100, c = 200) and dnadiff (v1.3) with the Pink Pepper assembly as the reference. Overall, the number of breakpoints was highest in JL, with 577,769 instances, while Finola exhibited the highest number of relocations (12,979) and translocations (32,620). The most inversions were observed in Purple Kush, totaling 3,158, and Cannbio-2 showed the greatest number of insertions, reaching 220,308. Although visually distinct large-scale structural variations were observed in Fig. 6a, cs10 showed the lowest values across all SV comparisons when compared to the other cultivars. This finding suggests significant structural genomic variation among cannabis cultivars bred for diverse purposes and by different methods.

These intra-species WGA results may stem from fragmented assemblies, as previously suggested in cannabis genomics69. However, they could also be due to phenotypic changes induced by chemical treatments such as silver nitrate and sodium thiosulfate, which are used to induce male flowers through inhibition of ethylene synthesis70, and to repeated inbreeding for strain stabilization71. Additionally, these results may be influenced by inbreeding within a limited population to achieve desired chemotypes or phenotypes. Given these differences, the accumulation of multiple high-quality cannabis genome assemblies can significantly enhance the resolution of molecular phylogenetic analyses, enabling the identification of subtle differences in evolutionary relationships and precise elucidation of phylogenetic dynamics. The SV data are available on Figshare for further analysis and use72.

Usage Notes

Table 6 provides a summary of the chromosome labels for easier data accessibility.

Table 6 Chromosome and annotation label of Cannabis sativa L. assembly of this data.


Code availability

Parameters not mentioned in the main text, excluding threads, were set to default values. No custom code was generated for this work.

References

Shahzad, A. Hemp fiber and its composites–a review. Journal of composite materials 46, 973–986 (2012).

Attia, Z., Pogoda, C. S., Vergara, D. & Kane, N. C. Variation in mtDNA haplotypes suggests a complex history of reproductive strategy in Cannabis sativa. bioRxiv 2020–12 (2020).

Leelawat, S. et al. Anticancer activity of Δ9-tetrahydrocannabinol and cannabinol in vitro and in human lung cancer xenograft. Asian Pacific Journal of Tropical Biomedicine 12, 323–332 (2022).

Blaskovich, M. A. et al. The antimicrobial potential of cannabidiol. Communications biology 4, 1–18 (2021).

Seltzer, E. S., Watters, A. K., MacKenzie, D., Granat, L. M. & Zhang, D. Cannabidiol (CBD) as a promising anti-cancer drug. Cancers 12, 3203 (2020).


Gaston, T. E. & Friedman, D. Pharmacology of cannabinoids in the treatment of epilepsy. Epilepsy & Behavior 70, 313–318 (2017).

Thapa, D. et al. The cannabinoids Δ8THC, CBD, and HU-308 act via distinct receptors to reduce corneal pain and inflammation. Cannabis and cannabinoid research 3, 11–20 (2018).

Billakota, S., Devinsky, O. & Marsh, E. Cannabinoid therapy in epilepsy. Current opinion in neurology 32, 220–226 (2019).

Radwan, M. M. et al. Isolation and pharmacological evaluation of minor cannabinoids from high-potency Cannabis sativa. Journal of natural products 78, 1271–1276 (2015).

Borille, B. T. et al. Near infrared spectroscopy combined with chemometrics for growth stage classification of cannabis cultivated in a greenhouse from seized seeds. Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy 173, 318–323 (2017).

Ryu, B. R. et al. Conversion characteristics of some major cannabinoids from hemp (Cannabis sativa L.) raw materials by new rapid simultaneous analysis method. Molecules 26, 4113 (2021).

Malabadi, R. B., Kolkar, K. & Chalannavar, R. Medical Cannabis sativa (Marijuana or Drug type); The story of discovery of Δ9-Tetrahydrocannabinol (THC). International Journal of Innovation Scientific Research and Review 5, 4134–4143 (2023).

National Conference of State Legislatures. State Medical Cannabis Laws. https://www.ncsl.org/health/state-medical-cannabis-laws (2024).

Blessing, E. M., Steenkamp, M. M., Manzanares, J. & Marmar, C. R. Cannabidiol as a potential treatment for anxiety disorders. Neurotherapeutics 12, 825–836 (2015).

Atalay, S., Jarocka-Karpowicz, I. & Skrzydlewska, E. Antioxidative and anti-inflammatory properties of cannabidiol. Antioxidants 9, 21 (2019).

Jones, N. A. et al. Cannabidiol Displays Antiepileptiform and Antiseizure Properties In Vitro and In Vivo. J Pharmacol Exp Ther 332, 569–577 (2010).

The effects of cannabidiol and its synergism with bortezomib in multiple myeloma cell lines. A role for transient receptor potential vanilloid type-2 – Morelli – 2014 – International Journal of Cancer – Wiley Online Library. https://onlinelibrary.wiley.com/doi/full/10.1002/ijc.28591?casa_token=C9zXgL9ZPwQAAAAA%3AvashIza5lImcCEYUrgsnibwgFqOE_sIkZa6VFroY7yJ9MrKn90kR0cGM_EmqjGOoNdQ2rWC5Q6hzCAk.

Fulvio, F., Righetti, L., Minervini, M., Moschella, A. & Paris, R. The B1080/B1192 molecular marker identifies hemp plants with functional THCA synthase and total THC content above legal limit. Gene 858, 147198 (2023).

The characterization of key physiological traits of medicinal cannabis (Cannabis sativa L.) as a tool for precision breeding | BMC Plant Biology. https://link.springer.com/article/10.1186/s12870-021-03079-2.

van Bakel, H. et al. The draft genome and transcriptome of Cannabis sativa. Genome Biol 12, R102 (2011).

Grassa, C. J. et al. A new Cannabis genome assembly associates elevated cannabidiol (CBD) with hemp introgressed into marijuana. New Phytologist 230, 1665–1679 (2021).

McGarvey, P. et al. De novo assembly and annotation of transcriptomes from two cultivars of Cannabis sativa with different cannabinoid profiles. Gene 762, 145026 (2020).

Laverty, K. U. et al. A physical and genetic map of Cannabis sativa identifies extensive rearrangements at the THC/CBD acid synthase loci. Genome Res. 29, 146–156 (2019).

Gao, S. et al. A high-quality reference genome of wild Cannabis sativa. Horticulture Research 7, 73 (2020).

Braich, S., Baillie, R. C., Spangenberg, G. C. & Cogan, N. O. A new and improved genome sequence of Cannabis sativa. Gigabyte 2020 (2020).

Kovalchuk, I. et al. The Genomics of Cannabis and Its Close Relatives. Annual Review of Plant Biology 71, 713–739 (2020).

Rhoads, A. & Au, K. F. PacBio Sequencing and its Applications. Genomics, Proteomics & Bioinformatics 13, 278–289 (2015).

Sun, X. et al. Nanopore Sequencing and Its Clinical Applications. in Precision Medicine (ed. Huang, T.) 13–32, https://doi.org/10.1007/978-1-0716-0904-0_2 (Springer US, New York, NY, 2020).

Marçais, G. & Kingsford, C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 27, 764–770 (2011).

Ranallo-Benavidez, T. R., Jaron, K. S. & Schatz, M. C. GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes. Nat Commun 11, 1432 (2020).

Zhou, Y. et al. De novo assembly of plant complete genomes. T 1, 1–8 (2022).

NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:ERS2852417 (2020).

De Roeck, A. et al. NanoSatellite: accurate characterization of expanded tandem repeat length and sequence through whole genome long-read sequencing on PromethION. Genome Biol 20, 239 (2019).

Chen, N. Using RepeatMasker to Identify Repetitive Elements in Genomic Sequences. Current Protocols in Bioinformatics 5, 4.10.1–4.10.14 (2004).

Fu, X.-G. et al. Phylogenomic analysis of the hemp family (Cannabaceae) reveals deep cyto-nuclear discordance and provides new insights into generic relationships. Journal of Systematics and Evolution 61, 806–826 (2023).

Padgitt-Cobb, L. K. et al. A draft phased assembly of the diploid Cascade hop (Humulus lupulus) genome. The Plant Genome 14, e20072 (2021).

Liang, Q. et al. The genome assembly and annotation of yellowhorn (Xanthoceras sorbifolium Bunge). GigaScience 8, giz071 (2019).

Liang, J., Kong, L., Hu, X., Fu, C. & Bai, S. Chromosomal-level genome assembly of the high-quality Xian/Indica rice (Oryza sativa L.) Xiangyaxiangzhan. BMC Plant Biol 23, 94 (2023).

Lee, D.-J. et al. Chromosome-Scale Genome Assembly and Triterpenoid Saponin Biosynthesis in Korean Bellflower (Platycodon grandiflorum). International Journal of Molecular Sciences 24, 6534 (2023).

Edwards, K. D. et al. A reference genome for Nicotiana tabacum enables map-based cloning of homeologous loci implicated in nitrogen utilization efficiency. BMC Genomics 18, 448 (2017).

Li, Z. et al. RNA-Seq improves annotation of protein-coding genes in the cucumber genome. BMC Genomics 12, 540 (2011).

Holt, C. & Yandell, M. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinformatics 12, 491 (2011).

Lukashin, A. V. & Borodovsky, M. GeneMark.hmm: New solutions for gene finding. Nucleic Acids Research 26, 1107–1115 (1998).

Zaharia, M. et al. Faster and More Accurate Sequence Alignment with SNAP. Preprint at https://doi.org/10.48550/arXiv.1111.5572 (2011).

Stanke, M. et al. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Research 34, W435–W439 (2006).

Haas, B. J. et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol 9, R7 (2008).

Buchfink, B., Reuter, K. & Drost, H.-G. Sensitive protein alignments at tree-of-life scale using DIAMOND. Nat Methods 18, 366–368 (2021).

Pruitt, K. D., Tatusova, T. & Maglott, D. R. NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Research 33, D501–D504 (2005).

Cheng, C.-Y. et al. Araport11: a complete reannotation of the Arabidopsis thaliana reference genome. The Plant Journal 89, 789–804 (2017).

Conesa, A. et al. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics 21, 3674–3676 (2005).

Jones, P. et al. InterProScan 5: genome-scale protein function classification. Bioinformatics 30, 1236–1240 (2014).

Moriya, Y., Itoh, M., Okuda, S., Yoshizawa, A. C. & Kanehisa, M. KAAS: an automatic genome annotation and pathway reconstruction server. Nucleic Acids Research 35, W182–W185 (2007).

NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRP402544 (2023).

NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRX17887361 (2023).

NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRX17887360 (2023).

NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRX17887355 (2023).

NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRX17887359 (2023).

NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRX17887358 (2023).

NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRX17887357 (2023).

NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRX17887356 (2023).

Cannabis sativa cultivar Pink pepper isolate KNU-18-1, whole genome shotgun sequencing project. GenBank https://identifiers.org/ncbi/insdc:JAQSJK000000000 (2023).

Cannabis sativa L. (Pink pepper) Annotation Data Set. figshare https://doi.org/10.6084/m9.figshare.21391449.v6 (2024).

Shen, W., Le, S., Li, Y. & Hu, F. SeqKit: a cross-platform and ultrafast toolkit for FASTA/Q file manipulation. PloS one 11, e0163962 (2016).

Camacho, C. et al. BLAST+: architecture and applications. BMC bioinformatics 10, 1–9 (2009).

Wang, Y. et al. MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic acids research 40, e49–e49 (2012).

Cannabis sativa, L. (Pink pepper) synteny result data set. figshare https://doi.org/10.6084/m9.figshare.27196350.v1 (2024).

Cabanettes, F. & Klopp, C. D-GENIES: dot plot large genomes in an interactive, efficient and simple way. PeerJ 6, e4958 (2018).

Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).

Hurgobin, B. et al. Recent advances in Cannabis sativa genomics research. New Phytologist 230, 73–89 (2021).

Flajšman, M., Slapnik, M. & Murovec, J. Production of feminized seeds of high CBD Cannabis sativa L. by manipulation of sex expression and its application to breeding. Frontiers in plant science 12, 718092 (2021).

Barcaccia, G. et al. Potentials and Challenges of Genomics for Breeding Cannabis Cultivars. Front. Plant Sci. 11 (2020).

Cannabis sativa, L. (Pink pepper) whole genome alignment data set. figshare https://doi.org/10.6084/m9.figshare.27198693.v1 (2024).

Lewis, M. A., Russo, E. B. & Smith, K. M. Pharmacological Foundations of Cannabis Chemovars. Planta Med 84, 225–233 (2018).

Acknowledgements

This study was supported by the Ministry of Science and ICT (MSIT, Korea) (support program: 2021-DD-UP-0379) and the BK21 FOUR program of the National Research Foundation (NRF, Korea). We thank the National Institute of Food and Drug Safety Evaluation for granting the Narcotics Academic Researcher approval (Permit Number: Seoul-1806, Seoul, Korea) and the Narcotics Raw Material Handling Approval (Narcotics Policy Division-4789) for the collection, cultivation, analysis, and use of the plant material in this study.

Author information

Authors and Affiliations

Department of Bio-Health Convergence, Kangwon National University, Chuncheon, 24341, Republic of Korea

Byeong-Ryeol Ryu, Ye-Rim Shin, Min-Ji Kang, Min-Jun Kim, Young-Seok Lim & Jung-Dae Lim

Institute of Cannabis Research, Colorado State University-Pueblo, 2200 Bonforte Blvd, Pueblo, CO, 81001-4901, USA

Byeong-Ryeol Ryu & Sang-Hyuck Park

National Agrobiodiversity Center, National Academy of Agricultural Science, Rural Development Administration, Jeonju, 54874, Republic of Korea

Gyeong-Ju Gim

Institute of Biological Resources, Chuncheon Bioindustry Foundation, Chuncheon, 24232, Republic of Korea

Tae-Hyung Kwon

Department of Bio-Health Technology, Kangwon National University, Chuncheon, 24341, Republic of Korea

Young-Seok Lim

Department of Bio-Functional Material, Kangwon National University, Samcheok, 25949, Republic of Korea

Jung-Dae Lim

Contributions

J.-D.L. conceived and designed the experiment. B.-R.R., G.-J.G., and Y.-R.S. drafted the manuscript and visualized the data. B.-R.R., G.-J.G., and S.-H.P. analyzed the data. B.-R.R. and Y.-S.L. bred the plant. M.-J.Kang, M.-J.Kim, and T.-H.K. propagated and managed the plant material. G.-J.G. and S.-H.P. corrected the manuscript. S.-H.P. and J.-D.L. supervised the study. All authors read and approved the final version of the manuscript.

Corresponding authors

Correspondence to
Sang-Hyuck Park or Jung-Dae Lim.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Ryu, BR., Gim, GJ., Shin, YR. et al. Chromosome-level Haploid Assembly of Cannabis sativa L. cv. Pink Pepper.
Sci Data 11, 1442 (2024). https://doi.org/10.1038/s41597-024-04288-8

Received: 08 July 2024

Accepted: 16 December 2024

Published: 28 December 2024

DOI: https://doi.org/10.1038/s41597-024-04288-8

“}]] Scientific Data – Chromosome-level Haploid Assembly of Cannabis sativa L. cv. Pink Pepper Read More
