HLAscan
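The software's original website no longer loads, but it is still being updated on GitHub at https://github.com/SyntekabioTools/HLAscan (perhaps the author changed jobs?). The most recent update was on 2019.12.4, which is fairly recent. WeGene's NGS HLA typing report cites this software's paper as a reference, so it is probably reasonably authoritative.
The usage has also changed somewhat. It used to be just a script; it now ships as a standalone compiled executable, which should run considerably faster and also spares you a fiddly installation. Even an AMD 4700U can run it (AMD YES!), not bad!
Installation and running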
# Download the software
wget https://github.com/SyntekabioTools/HLAscan/releases/download/v2.1.4/hla_scan_r_v2.1.4
wget https://github.com/SyntekabioTools/HLAscan/releases/download/v2.0.0/dataset.zip
# Make the downloaded binary executable
chmod +x hla_scan_r_v2.1.4
# Unpack the dataset
unzip dataset.zip
# Run the typing once per gene
# (the gene list is unquoted so that the shell iterates over one gene at a time)
for gene in HLA-A HLA-B HLA-C HLA-E HLA-F HLA-G \
    MICA MICB HLA-DMA HLA-DMB HLA-DOA HLA-DOB HLA-DPA1 HLA-DPB1 HLA-DQA1 HLA-DQB1 HLA-DRA HLA-DRB1 HLA-DRB5 TAP1 TAP2
do
    ./hla_scan_r_v2.1.4 -l ../read_1.fq -r ../read_2.fq -d db/HLA-ALL.IMGT \
        -t 8 -g "$gene"
done
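If you want each gene's report kept in its own file instead of scrolling past in the terminal, the loop can be wrapped in a small function. This is only a sketch: `HLASCAN_BIN`, `READ1`, `READ2`, `HLA_DB`, `THREADS` and `run_hlascan_genes` are names made up for this example, and it assumes HLAscan prints its report to standard output.

```shell
#!/bin/sh
# Sketch: run HLAscan once per gene and keep each report in its own file.
# All variable names here are hypothetical; the defaults mirror the paths
# used in the loop above.
HLASCAN_BIN="${HLASCAN_BIN:-./hla_scan_r_v2.1.4}"
READ1="${READ1:-../read_1.fq}"
READ2="${READ2:-../read_2.fq}"
HLA_DB="${HLA_DB:-db/HLA-ALL.IMGT}"
THREADS="${THREADS:-8}"

# usage: run_hlascan_genes OUT_DIR GENE [GENE ...]
run_hlascan_genes() {
    out_dir=$1
    shift
    mkdir -p "$out_dir" || return 1
    for gene in "$@"; do
        # Assumption: the report goes to standard output, so a redirect captures it.
        if ! "$HLASCAN_BIN" -l "$READ1" -r "$READ2" -d "$HLA_DB" \
                -t "$THREADS" -g "$gene" > "$out_dir/$gene.txt" 2>&1; then
            # Keep going with the remaining genes even if one run fails.
            echo "warning: HLAscan exited with an error for $gene" >&2
        fi
    done
}
```

For example, `run_hlascan_genes reports HLA-A HLA-B HLA-C` would leave `reports/HLA-A.txt`, `reports/HLA-B.txt` and `reports/HLA-C.txt` behind.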
Results
And then the results came out:
=====================================================
HLAscan v2.1
Report created
2020. 10. 21. 12:4:15
========================================================
HLA gene : HLA-A
# of considered types : 3182
----------- HLA-Types -----------
[Type 1] 31:01:02:01 EX3_209.094_100 EX2_244.789_100 EX4_291.888_100 EX5_190.632_100
[Type 2] 03:01:01:03 EX3_166.42_100 EX2_197.259_100 EX4_250.399_100 EX5_169.726_100
=====================================================
HLAscan v2.1
Report created
2020. 10. 21. 12:13:32
========================================================
HLA gene : HLA-B
# of considered types : 3958
----------- HLA-Types -----------
[Type 1] 48:01:01 EX3_528.214_100 EX2_654.385_100 EX4_984.435_100 EX5_607.077_100
[Type 2] 15:11:01 EX3_368.938_100 EX2_464.43_100 EX4_692.304_100 EX5_423.094_100
=====================================================
HLAscan v2.1
Report created
2020. 10. 21. 12:20:25
========================================================
HLA gene : HLA-C
# of considered types : 2735
----------- HLA-Types -----------
[Type 1] 08:01:01 EX3_169.279_100 EX2_194.726_100 EX4_296.558_100 EX5_194.783_100
[Type 2] 03:03:01 EX3_167.344_100 EX2_171.144_100 EX4_266.931_100 EX5_155.4_100
=====================================================
HLAscan v2.1
Report created
2020. 10. 21. 12:20:28
========================================================
HLA gene : HLA-E
# of considered types : 17
----------- HLA-Types -----------
HLAscan cannot determine proper types
=====================================================
HLAscan v2.1
Report created
2020. 10. 21. 12:20:32
========================================================
HLA gene : HLA-F
# of considered types : 22
----------- HLA-Types -----------
HLAscan cannot determine proper types
=====================================================
HLAscan v2.1
Report created
2020. 10. 21. 12:20:40
========================================================
HLA gene : HLA-G
# of considered types : 50
----------- HLA-Types -----------
HLAscan cannot determine proper types
=====================================================
HLAscan v2.1
Report created
2020. 10. 21. 12:20:57
========================================================
HLA gene : MICA
# of considered types : 102
----------- HLA-Types -----------
HLAscan cannot determine proper types
=====================================================
HLAscan v2.1
Report created
2020. 10. 21. 12:21:3
========================================================
HLA gene : MICB
# of considered types : 41
----------- HLA-Types -----------
HLAscan cannot determine proper types
=====================================================
HLAscan v2.1
Report created
2020. 10. 21. 12:21:4
========================================================
HLA gene : HLA-DMA
# of considered types : 7
----------- HLA-Types -----------
HLAscan cannot determine proper types
=====================================================
HLAscan v2.1
Report created
2020. 10. 21. 12:21:6
========================================================
HLA gene : HLA-DMB
# of considered types : 13
----------- HLA-Types -----------
HLAscan cannot determine proper types
=====================================================
HLAscan v2.1
Report created
2020. 10. 21. 12:21:8
========================================================
HLA gene : HLA-DOA
# of considered types : 12
----------- HLA-Types -----------
HLAscan cannot determine proper types
=====================================================
HLAscan v2.1
Report created
2020. 10. 21. 12:21:10
========================================================
HLA gene : HLA-DOB
# of considered types : 13
----------- HLA-Types -----------
HLAscan cannot determine proper types
=====================================================
HLAscan v2.1
Report created
2020. 10. 21. 12:21:16
========================================================
HLA gene : HLA-DPA1
# of considered types : 40
----------- HLA-Types -----------
HLAscan cannot determine proper types
=====================================================
HLAscan v2.1
Report created
2020. 10. 21. 12:22:30
========================================================
HLA gene : HLA-DPB1
# of considered types : 550
----------- HLA-Types -----------
HLAscan cannot determine proper types
=====================================================
HLAscan v2.1
Report created
2020. 10. 21. 12:22:37
========================================================
HLA gene : HLA-DQA1
# of considered types : 54
----------- HLA-Types -----------
HLAscan cannot determine proper types
=====================================================
HLAscan v2.1
Report created
2020. 10. 21. 12:24:20
========================================================
HLA gene : HLA-DQB1
# of considered types : 806
----------- HLA-Types -----------
[Type 1] 03:03:02:01 EX2_380.615_100 EX3_638.819_100 EX4_0_0
[Type 2] 06:02:01 EX2_285.522_100 EX3_589.078_100 EX4_0_0
=====================================================
HLAscan v2.1
Report created
2020. 10. 21. 12:24:21
========================================================
HLA gene : HLA-DRA
# of considered types : 7
----------- HLA-Types -----------
HLAscan cannot determine proper types
=====================================================
HLAscan v2.1
Report created
2020. 10. 21. 12:28:15
========================================================
HLA gene : HLA-DRB1
# of considered types : 1756
----------- HLA-Types -----------
[Type 1] 09:01:02 EX2_791.144_100 EX3_672.496_100 EX4_0_0
[Type 2] 15:01:01:04 EX2_707.333_100 EX3_651.83_100 EX4_0_0
=====================================================
HLAscan v2.1
Report created
2020. 10. 21. 12:28:18
========================================================
HLA gene : HLA-DRB5
# of considered types : 21
----------- HLA-Types -----------
[Type 1] 02:06 EX2_14.9259_50 EX3_0.58156_0 EX4_0_0
[Type 2] 02:06 EX2_14.9259_50 EX3_0.58156_0 EX4_0_0
=====================================================
HLAscan v2.1
Report created
2020. 10. 21. 12:28:20
========================================================
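With this many genes the raw reports are tedious to read, so a small filter that boils them down to one line per gene can help. The following is a minimal sketch that assumes the report layout is exactly as printed above (`HLA gene : ...`, `[Type 1] ...`, `[Type 2] ...`); `summarize_hlascan` is a name made up for this example and is not part of HLAscan.

```shell
#!/bin/sh
# usage: summarize_hlascan < combined_report.txt
# Prints one tab-separated line per gene: gene, allele 1, allele 2.
# Genes for which HLAscan could not determine a type are reported as NA.
summarize_hlascan() {
    awk '
        # Print the gene collected so far, if any.
        function emit() {
            if (gene != "") {
                printf "%s\t%s\t%s\n", gene, (t1 == "" ? "NA" : t1), (t2 == "" ? "NA" : t2)
            }
        }
        # A new "HLA gene : X" line starts the next report.
        /^HLA gene :/  { emit(); gene = $4; t1 = ""; t2 = "" }
        # The allele name is the third whitespace-separated field.
        /^\[Type 1\]/  { t1 = $3 }
        /^\[Type 2\]/  { t2 = $3 }
        END            { emit() }
    '
}
```

Feeding it the output above would give lines such as `HLA-A`, `31:01:02:01`, `03:01:01:03` separated by tabs, with `NA` in both allele columns for genes like HLA-E.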
HLA-LA
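1. Software installation and database preparation
Sticking with conda, which takes care of the installation headaches and avoids Docker, which has a bit of a learning curve.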
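## Installation
conda install -c bioconda hla-la
## Download the database
cd ~/miniconda3/opt/hla-la/
mkdir -p graphs
# Unpack inside graphs/ so that the path passed to prepareGraph below exists
cd graphs
wget http://www.well.ox.ac.uk/downloads/PRG_MHC_GRCh38_withIMGT.tar.gz
tar -xvzf PRG_MHC_GRCh38_withIMGT.tar.gz
# Index the database. This step uses about 30G of memory; my laptop with 16G of RAM got through it by leaning on swap, which made it a great deal slower.
cd ~/miniconda3/opt/hla-la/bin/
./HLA-LA --action prepareGraph --PRG_graph_dir ../graphs/PRG_MHC_GRCh38_withIMGT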
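2. Using it: typing
It only takes a few simple parameters. With 8 cores it will just have to grind along slowly; I am not sure whether it will throw an error.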
HLA-LA.pl --BAM ./2hla_sorted.bam --graph PRG_MHC_GRCh38_withIMGT --sampleID 10 --maxThreads 8 --workingDir ./
It then stopped running once swap plus RAM hit the 70G limit.
Seeing this issue on GitHub left me a bit hopeless: my hardware is nowhere near this level!
my paired-end fastq file:
R1.fastq (250 Million reads, 150bp, ~1.2 GB)
R2.fastq (250 Million reads, 150bp, ~1.2 GB)
run HLA-LA will used about 300~400 GB RAM and ~90GB swap
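Given how long a run grinds along before it dies, it may be worth checking up front whether RAM plus swap can even reach the amount you expect to need. Below is a minimal sketch for Linux that reads `/proc/meminfo`. The function names are made up for this example, and the threshold is whatever you estimate yourself; nothing here comes from HLA-LA.

```shell
#!/bin/sh
# usage: mem_plus_swap_gib [MEMINFO_FILE]
# Prints MemTotal + SwapTotal in whole GiB (rounded down).
# /proc/meminfo reports both values in KiB; 1 GiB = 1048576 KiB.
mem_plus_swap_gib() {
    awk '/^(MemTotal|SwapTotal):/ { kib += $2 }
         END { print int(kib / 1048576) }' "${1:-/proc/meminfo}"
}

# usage: enough_memory REQUIRED_GIB [MEMINFO_FILE]
# Succeeds if RAM plus swap is at least REQUIRED_GIB.
enough_memory() {
    [ "$(mem_plus_swap_gib "$2")" -ge "$1" ]
}
```

Something like `enough_memory 300 || echo "not enough RAM + swap, skipping HLA-LA"` before the typing command would have saved me the wait. Swap only postpones the problem, of course: it counts toward the total here, but a run that lives mostly in swap will be very slow.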
OptiType
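Software installation
I first tried Docker, which failed miserably. Then I found that bioconda carries this software, so I switched to conda, which feels more convenient than Docker anyway. Another advantage: Windows 10 Home does not support Docker, and enabling it takes a round of registry edits, which is too much hassle.
# Either one of the following two commands will do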
conda install -c bioconda optitype
conda install -c bioconda/label/cf201901 optitype
Running and results
A single simple command is all it takes.
OptiTypePipeline.py -i read_1.fq read_2.fq --dna -v -o optutype
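Powered by the R7-4700U (AMD YES), the typing finished while running close to the limits of the hardware.
The first result is a PDF file: a plot of sequencing coverage for the typing result.
Then there is a TSV file with the typing result itself, covering only HLA-A, B and C, at 4-digit resolution: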
A1 A2 B1 B2 C1 C2 Reads Objective
0 A*03:01 A*31:01 B*15:11 B*48:01 C*03:03 C*08:01 15556.0 15135.987999999903
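Since OptiType reports 4-digit (two-field) alleles while HLAscan reports up to four fields, the two results can only be compared after truncating the HLAscan calls. Here is a minimal sketch of that comparison; `two_field`, `normalize_pair` and `same_pair` are names made up for this example.

```shell
#!/bin/sh
# usage: two_field ALLELE      (e.g. 31:01:02:01 or A*31:01)
# Drops an optional "GENE*" prefix and keeps the first two colon-separated fields.
two_field() {
    printf '%s\n' "$1" | sed 's/^[^*]*\*//' | cut -d: -f1,2
}

# usage: normalize_pair "ALLELE1 ALLELE2"
# Truncates both alleles to two fields and sorts them, so order does not matter.
normalize_pair() {
    printf '%s\n' "$1" | tr -s ' ' '\n' | sed 's/^[^*]*\*//' \
        | cut -d: -f1,2 | sort | tr '\n' ' '
}

# usage: same_pair "A1 A2" "B1 B2"
# Succeeds when both pairs are identical at two-field resolution.
same_pair() {
    [ "$(normalize_pair "$1")" = "$(normalize_pair "$2")" ]
}
```

For instance, `same_pair '31:01:02:01 03:01:01:03' 'A*03:01 A*31:01'` succeeds. On the results shown here, the HLAscan calls for HLA-A, HLA-B and HLA-C all agree with OptiType at two-field resolution.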