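Setting up a code repository with GitLab
Before I joined, the company hosted its code on SVN running on Windows Server, and every permission change meant logging in to the server remotely and granting access by hand. There are many GitHub-like platforms available, so I picked GitLab for a trial, and I recommend that everyone migrate to Git.
1. The setup is simple: download the package, install it, and start the service.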
curl -O https://downloads-packages.s3.amazonaws.com/centos-6.6/gitlab-7.6.1_omnibus.5.3.0.ci.1-1.el6.x86_64.rpm
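Then edit gitlab.rb as described in the instructions and start the service. Pay attention to port 8080 and to SSH port forwarding.
2. Updating to the latest GitLab version with Docker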
Updated 2016-03-24
## Modify the firewall
rCharts and morris.js
1. Using rCharts and morris.js for time series visualization
Here is the code!
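# install once with: devtools::install_github('ramnathv/rCharts')
library(rCharts)
df <- read.csv("type.1h.csv", header = FALSE, stringsAsFactors = FALSE)
colnames(df) <- c("date", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10")
# keep the date column as character strings
df$date <- as.character(df$date)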
m1 <- mPlot(x = "date", y = c("1", "2","3","4","5","6","7","8","9","10"), type = "Line", data = df)
m1$set(pointSize = 0, lineWidth = 1)
m1$print("chart2")
m1
#base64enc
# install.packages("base64enc")
library("base64enc")
m1$save('graph1.html', 'inline', cdn=TRUE)
#m1$save('graph1.html', 'inline', standalone=TRUE)
2. Graph
Here, we can see this is a test graph!
Here is a test image!
Here is a test page!
graph1.html
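Reinstalling CentOS on a server and setting a static IP
Five of our servers hosted in a telecom data center were attacked, so for safety my boss asked me to reinstall the operating system. You can install from a burned disc or from a USB drive (details omitted).
1. Reinstall the system
reboot (Ctrl+Alt+Delete)
2. Set a static IP for network access
#### Internal or external IP
3. References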
http://blog.51yip.com/linux/1120.html
https://github.com/iofdata/DM/issues/8
Sysbench for MySQL Testing
1. The whole testing script
#!/bin/bash
# MySQL connection settings
host=localhost
port=3306
socket=/home/data/mysql/mysql.sock
user=root
password=123456
resultsdir=./results-thread
# Thread counts and table sizes (rows) to benchmark
threads="8 16 32 64 128"
sizes="1000000 5000000 10000000 15000000 20000000 25000000 30000000"
# CSV header for the summary file
printf "sizes,threads,transactions,trns p/s,deadlocks,dls p/s,read/write requests,r/w reqs p/s,min,avg,max,95 percentile\n" >> stat.txt
mkdir -p "$resultsdir"
for thread in $threads; do
    mkdir -p "$resultsdir/thread-$thread"
    for size in $sizes; do
        report="$resultsdir/thread-$thread/sysbench.$thread.$size.report"
        # Create and fill the test table
        sysbench --test=oltp --mysql-table-engine=innodb \
            --oltp-table-size=$size --mysql-socket=$socket \
            --mysql-user=$user --mysql-host=$host \
            --mysql-password=$password --mysql-db=students \
            --oltp-table-name=test$size prepare
        # Run the benchmark and keep the raw report
        sysbench --test=oltp --mysql-table-engine=innodb \
            --oltp-table-size=$size --mysql-socket=$socket \
            --mysql-user=$user --mysql-host=$host \
            --mysql-password=$password --mysql-db=students \
            --oltp-table-name=test$size --max-requests=1000 \
            --num-threads=$thread run | \
            tee -a "$report"
        # Drop the test table
        sysbench --test=oltp --mysql-host=$host --mysql-user=$user \
            --mysql-password=$password --mysql-socket=$socket \
            --mysql-db=students --oltp-table-name=test$size cleanup
        # Flatten the report into one CSV row, prefixed with the table size
        egrep "cat|threads:|transactions:|deadlocks|read/write|min:|avg:|max:|percentile:" "$report" | \
            sed -e '1 s/Number of threads: //' | \
            tr -d "\n" | \
            sed -e 's/Number of threads: /\n/g' \
                -e 's/[A-Za-z\/]\{1,\}://g' \
                -e 's/read\/write//g' \
                -e 's/approx\. *95//g' \
                -e 's/per sec.)//g' \
                -e 's/ms//g' \
                -e 's/(//g' \
                -e 's/  */,/g' | awk -v d="$size" '{$0=d","$0}1' >> stat.txt
    done
done
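The sed pipeline in the script flattens each report into one CSV row. As a rough alternative, here is a minimal Python sketch of the same extraction. It assumes the report contains lines shaped like those in `SAMPLE_REPORT`; that sample is invented for illustration and is not real benchmark output, and `parse_report` and the field names are hypothetical.

```python
import re

# Illustrative report fragment; the numbers are invented, not measured.
SAMPLE_REPORT = """\
Number of threads: 8
    transactions:                        1000   (250.50 per sec.)
    deadlocks:                           0      (0.00 per sec.)
    read/write requests:                 19000  (4759.50 per sec.)
         min:                                  2.46ms
         avg:                                  5.15ms
         max:                                 38.61ms
         approx.  95 percentile:               6.38ms
"""

# One regex per field: the label, then the number that follows it.
FIELDS = {
    "threads": r"Number of threads:\s*(\d+)",
    "transactions": r"transactions:\s*(\d+)",
    "trns_per_sec": r"transactions:\s*\d+\s*\(([\d.]+) per sec",
    "deadlocks": r"deadlocks:\s*(\d+)",
    "rw_requests": r"read/write requests:\s*(\d+)",
    "min_ms": r"min:\s*([\d.]+)ms",
    "avg_ms": r"avg:\s*([\d.]+)ms",
    "max_ms": r"max:\s*([\d.]+)ms",
    "pct_ms": r"percentile:\s*([\d.]+)ms",
}


def parse_report(text):
    """Return a dict of field name -> float for every field found in text."""
    row = {}
    for name, pattern in FIELDS.items():
        match = re.search(pattern, text)
        if match:
            row[name] = float(match.group(1))
    return row


row = parse_report(SAMPLE_REPORT)
print(",".join(f"{name}={value}" for name, value in row.items()))
```

Named fields make it easier to notice when a report line is missing, which the positional sed output would silently shift.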
PacBio Sequencing
Here is my short talk about PacBio sequencing; suggestions are welcome!
genome annotation
genome assembly
Here is a short talk about genome assembly.
And for DBG details, please click on DBG.
Here I updated the images of the slide for you.
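A Single Library for Genome Assembly
An Illumina report compared how read length, coverage, insert size, and other factors affect assembly results. Under ideal conditions, for a simple genome, roughly 30X of short-insert reads plus a moderate amount of long-insert reads covers enough of the genome and gives good metrics such as N50.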
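Since N50 is used here as a quality metric, a minimal Python sketch of its usual definition may help: N50 is the length L such that contigs of length at least L together cover at least half of the total assembly. The contig lengths below are made-up example values, and `n50` is a name chosen for illustration.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


contig_lengths = [80, 70, 50, 40, 30, 20, 10]  # hypothetical contigs, in bp
print(n50(contig_lengths))
```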
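Early Sanger sequencing used a 1k-40k library strategy, probably to avoid the influence of repeats. Later, SOAPdenovo kept a 200, 500, 2k, 5k, 10k library design when assembling the human genome. The exact strategy differs from genome to genome, but in general both short-insert libraries (<2k) and long-insert libraries (>2k) are needed. ABySS is an exception: for the African genome it used only 42X of data from the 210 library.
GAGE evaluated the results of several assemblers and includes a section titled "Effect of multiple libraries on assembly". Combined with my own project experience, the purpose of a multi-library strategy is to assist scaffolding. Contig assembly mainly uses the overlap information between reads: as long as sequencing is random and uniform and the depth is sufficient, short-insert reads assemble well into contigs (consensus sequences without Ns). The contig step does not involve library information (insert size and paired-end relationships). Scaffolding, in contrast, needs library information to establish links between contigs, and what matters most there is a gradient of libraries larger than 2k. That is why software such as ALLPATHS recommends one short-insert library plus one large-insert library. The jewel wasp (金小蜂) is a haploid species with a genome that is not very large. The individuals are small and DNA extraction is difficult, so a single bee sample is not enough to build three short-insert libraries (200, 500, 800). We can try to build just one or two libraries, and the effect on contig assembly will not be large (for a single-chromosome ant I once assembled, sample limits also meant building only one 500 library, and the result was good).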
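To make the contig versus scaffold distinction concrete, the following Python sketch counts read pairs whose two mates land on different contigs; pairs like these carry the library information that scaffolding relies on. The input format, the `min_links` threshold, and the function name `contig_links` are all assumptions for illustration, not the behavior of any particular assembler.

```python
from collections import Counter


def contig_links(pair_hits, min_links=2):
    """Count mate pairs that connect two different contigs.

    pair_hits: iterable of (contig_of_mate1, contig_of_mate2) tuples.
    Returns {(contig_a, contig_b): count} for links seen at least
    min_links times, with names sorted so (a, b) and (b, a) are one link.
    """
    counts = Counter()
    for first, second in pair_hits:
        if first != second:  # mates on the same contig carry no scaffold link
            counts[tuple(sorted((first, second)))] += 1
    return {link: n for link, n in counts.items() if n >= min_links}


# Hypothetical mapping results for six read pairs
hits = [("c1", "c1"), ("c1", "c2"), ("c2", "c1"),
        ("c2", "c3"), ("c1", "c2"), ("c3", "c3")]
print(contig_links(hits))
```

A real scaffolder would also use mate orientation and insert size to order contigs and estimate gap lengths; this sketch only shows where the link evidence comes from.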
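We also note the progress of recent assemblers such as fermi, which can already do de novo assembly of a human genome from one sample, one library, and 35X of data.
To support the follow-up analysis and discussion, I will look up published bee and ant assembly papers for everyone to read. Hymenoptera research is quite active at the moment, so there is plenty to draw on. To move this project forward quickly, we do not need to insist on building three libraries. That is my opinion.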
Bayesian Genome Assembly and MCMC Assessment
Introduction
They first build an assembly graph starting from a de Bruijn graph of the reads. Then they remove all tips and merge all unambiguous paths into single nodes that are annotated by the sequence of merged K-mers.
The resulting unresolved assembly graph (no longer de Bruijn) is a directed graph that consists only of bubbles and is a minimal representation of the variants that can be inferred from the sequenced data. Concatenating the sequences across the nodes in a particular path through this graph gives a possible assembly sequence.
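The graph construction described above can be sketched in Python as follows. This is a toy illustration under stated assumptions, not the authors' implementation: nodes are k-mers, an edge joins two k-mers that are adjacent in some read, tip removal is omitted, and isolated cycles are ignored. The function names and the two example reads are hypothetical.

```python
from collections import defaultdict


def build_graph(reads, k):
    """Node = k-mer; edge u -> v when v directly follows u in some read."""
    succ = defaultdict(set)
    pred = defaultdict(set)
    nodes = set()
    for read in reads:
        kmers = [read[i:i + k] for i in range(len(read) - k + 1)]
        nodes.update(kmers)
        for u, v in zip(kmers, kmers[1:]):
            succ[u].add(v)
            pred[v].add(u)
    return nodes, succ, pred


def merge_unambiguous_paths(nodes, succ, pred):
    """Collapse every non-branching chain of k-mers into one sequence."""
    def is_chain_start(node):
        # A node starts a chain unless it is the only successor
        # of its only predecessor.
        if len(pred[node]) != 1:
            return True
        (parent,) = pred[node]
        return len(succ[parent]) != 1

    merged = []
    for node in sorted(nodes):
        if not is_chain_start(node):
            continue
        seq = node
        current = node
        while len(succ[current]) == 1:
            (nxt,) = succ[current]
            if len(pred[nxt]) != 1:
                break  # nxt is a merge point and starts its own node
            seq += nxt[-1]  # consecutive k-mers overlap by k - 1 bases
            current = nxt
        merged.append(seq)
    return merged


# Two hypothetical reads that differ at one base, which creates a bubble
reads = ["ATGCAGTC", "ATGCTGTC"]
nodes, succ, pred = build_graph(reads, k=3)
merged = merge_unambiguous_paths(nodes, succ, pred)
print(merged)
```

Here the merged nodes form a single bubble: a shared prefix, two alternative branches, and a shared suffix. Joining the nodes along one branch, while dropping the k - 1 overlapping bases at each junction, reproduces one of the two input reads.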