Yang Feng's Blog

A blog about R, LaTeX and many other things.

Auto-Removing Fields from Final Bibtex-File using jabref (ZZ)

Posted by yangfeng on October 23, 2011

Credit: original post at http://sourceforge.net/projects/jabref/forums/forum/318825/topic/2573629

1. In Options->’Set up general fields’, add a new field, e.g. url_jabref, to the ‘General’ line.

2. Go to Tools->’Set/Clear/Rename Fields’ and rename url to url_jabref.

3. Standard BibTeX styles ignore the unknown url_jabref field, so the url no longer appears in the bibliography of the article, but the url to the article is still in your database.

Posted in Latex Tips

Manage a local .bib file system (simplified from other tutorials)

Posted by yangfeng on October 20, 2011

Create the following folder structure under a dedicated folder (e.g., one called “my local tex system”):

1. bibtex

1.1 bib (put your .bib files inside this folder)

1.2 bst

2. tex

2.1 latex

When you are finished, open MiKTeX Maintenance->Settings->Roots->Add and choose the directory “my local tex system”. You are all set to go!

In the future, you can just use \bibliography{***} (the name of the .bib file, conventionally given without the .bib extension) inside any document on the computer.
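The folder layout above can also be created from a shell (for example Cygwin or Git Bash on Windows). This is only a sketch: the TEXROOT variable and its default location are my own assumptions, not part of the original tutorial, and the root still has to be registered in MiKTeX by hand.

```shell
# Hypothetical root folder; the post calls it "my local tex system".
TEXROOT="${TEXROOT:-$PWD/my-local-tex-system}"

# bibtex/bib holds .bib databases, bibtex/bst holds style files,
# tex/latex holds local packages and classes.
mkdir -p "$TEXROOT/bibtex/bib" "$TEXROOT/bibtex/bst" "$TEXROOT/tex/latex"

# List the directories that were created.
find "$TEXROOT" -type d | sort
```

After this, copy your .bib files into bibtex/bib and add the root folder in the MiKTeX settings as described above.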

Posted in Latex Tips

WinEdt 6: \cite auto popup when typing \cite{}

Posted by yangfeng on October 20, 2011

It seems that when I use my own .bib file as the reference database, WinEdt does not automatically pop up the list of references when I type \cite{}. A solution is to open the .bib file, go to menu->View->Tree, and click the first button, “Build Tree”. Then go back to the .tex file and you are ready to go!

Posted in Latex Tips

Best Latex Combination in Windows (Winedt 6 + SumatraPDF)

Posted by yangfeng on March 3, 2011

Winedt 6 + SumatraPDF
1. Install WinEdt 6, SumatraPDF and MiKTeX.
2. Settings in WinEdt 6:
Options->Execution Modes->PDF Viewer->Change the path to SumatraPDF.
Tick the box for SyncTeX.
3. If you had WinEdt 5.5 before, there may be file association problems. First uninstall WinEdt 5.5, then go to
Options->Configuration Wizard->Filetype Associations
and select the file types you would like WinEdt 6 to be associated with.
4. Wrapping in WinEdt 6. Here is my configuration (it automatically rewraps when you resize the window, which I find very convenient):
Options->Options Interface->Formatting:Wrapping…->Wrapping->Change the corresponding paragraph to the following:

ENABLE_WRAPPING=1
WRAPPING_FILTER="TeX;HTML;ANSI;ASCII|DTX;INS;STY;TOC;EDT;INI"

// If you don’t want TeX Documents to be treated in Soft Mode remove TeX; from the filter!
// Soft Wrapping (like Notepad):
SOFT_WRAPPING=1
SOFT_WRAPPING_FILTER="TeX;HTML;ANSI;Soft|DTX;INS;STY;TOC;Hard"
//SOFT_WRAPPING_FILTER="HTML;ANSI;Soft|DTX;INS;STY;TOC;Hard"

// For those that know how to use it…
SEMISOFT_WRAPPING=0
SEMISOFT_WRAPPING_FILTER="XS|UNIX;PC"

// Fixed Right Margin
RIGHT_MARGIN=50

// Soft Wrapping Options
// Change to FIXED_RIGHT_MARGIN=0 if you want to use the
// size of the Window to determine Right Margin
// (resizing can be annoying!)…
FIXED_RIGHT_MARGIN=1
WRAP_AND_UNWRAP=1
REFORMAT_ON_RESIZE=1 // Ignored for Fixed Right Margin!

INDENTED_SOFT_WRAPPING=1 // Indent the whole Paragraph?
SHOW_LINE_BREAKS=0
SHOW_LINE_BREAKS_UNWRAPPED=0

// Wrap Comments in Soft Mode?
WRAP_SOFT_COMMENTS=1

// Break Lines even if Wrapping is turned off?
PERSISTENT_LINE_WRAPPING=0

// Undo information:
GROUP_UNDO=0
GROUP_UNDO_SOFT=1

Posted in Latex Tips

Using SSH without password

Posted by yangfeng on September 21, 2010

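Take the SSH Secure Shell software as an example. There are three steps:

1. First connect to the desired SSH server.
2. Edit->Settings->Global Settings->User Authentication->Keys->Generate New->Next->RSA->Enter a file name, leaving the comment and passphrase empty (suppose the public key file is my_pub_key.pub)

->Click Upload Public Key->Select Destination Folder: .ssh

3. On the Linux server, run the command
ssh-keygen -i -f ~/.ssh/my_pub_key.pub >> ~/.ssh/authorized_keys

The -i option converts the uploaded SSH2-format public key to the OpenSSH format (some older OpenSSH releases also accepted -X for this), and >> appends to authorized_keys instead of overwriting keys that are already there.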
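To see what the conversion in step 3 does, here is a small round trip in a scratch directory. It is only a sketch under my own assumptions: the file names are made up for the demonstration, the SSH2 (RFC 4716) file produced with -e stands in for the key uploaded by the Windows client, and nothing under ~/.ssh is touched.

```shell
# Sketch only: runs in a scratch directory and never touches ~/.ssh.
if command -v ssh-keygen >/dev/null 2>&1; then
  WORK="$(mktemp -d)"

  # Demo key pair with an empty passphrase (hypothetical file name).
  ssh-keygen -q -t rsa -b 2048 -N "" -f "$WORK/demo_key"

  # Export the public key in SSH2 (RFC 4716) format, the multi-line
  # "BEGIN SSH2 PUBLIC KEY" form.
  ssh-keygen -e -f "$WORK/demo_key.pub" > "$WORK/my_pub_key.pub"

  # Convert it back to the one-line OpenSSH form used in authorized_keys.
  ssh-keygen -i -f "$WORK/my_pub_key.pub" > "$WORK/converted.pub"

  # The key material (second field) should be the same before and after.
  ORIG_BLOB="$(cut -d' ' -f2 "$WORK/demo_key.pub")"
  CONV_BLOB="$(cut -d' ' -f2 "$WORK/converted.pub")"
  [ "$ORIG_BLOB" = "$CONV_BLOB" ] && echo "round trip OK"
fi
```

The converted line is what ends up in authorized_keys on the server.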

Posted in Uncategorized

The rsync Tool In Windows

Posted by yangfeng on September 21, 2010

Here is a reference: http://blog.myownserver.info/2010/04/how-to-install-cwrsync-for-windows-vista7-tutorial/#home_directory

To be more clear:

0. Download SSH-Client.exe and put it under C:\Program Files (x86)\
1. Download cwRsync here: http://sourceforge.net/projects/sereds/files/
2. Install cwRsync under the folder C:\cwRsync. After installing, add “c:\cwRsync\bin” to the system PATH variable so that rsync can be called from any location.
3. Under C:\cwRsync, create the folder “\home\yangfeng”.
4. On the PC, using cmd, run

$ ssh-keygen -t rsa

This will prompt for a passphrase; just press the Enter key. It will then generate an identification (private key) and a public key. Never share the private key with anyone! ssh-keygen shows where it saved the public key, which is by default ~/.ssh/id_rsa.pub:

Your public key has been saved in //home/yangfeng/.ssh/id_rsa.pub.

If the file is not there, copy it to that folder.

5. Transfer the id_rsa.pub file to host_dest by ftp, scp, rsync or any other method (e.g., the SSH Client).

6. On host_dest, log in (e.g., with the SSH Client) as the remote user you plan to use when you run scp, ssh or rsync on host_src.

7. Append the contents of id_rsa.pub to ~/.ssh/authorized_keys:

$ cat id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 700 ~/.ssh
$ chmod 600 ~/.ssh/authorized_keys

If this file does not exist, the cat command above will create it. Make sure other users cannot write to this file or to the ~/.ssh directory: with its default StrictModes setting, the OpenSSH server ignores an authorized_keys file that other users could modify. If it only holds public keys, why restrict access at all? Because anyone who can add a key to this file can log in as you.
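Step 7 can be wrapped in a small helper so that running it twice does not add the key twice. This is a sketch, not the original author's method: the install_key function, the scratch directory, and the placeholder key line are all hypothetical, and the demo writes to a temporary folder rather than a real ~/.ssh.

```shell
# install_key PUBKEY_FILE SSH_DIR
# Appends the public key to SSH_DIR/authorized_keys unless it is already there,
# then tightens the permissions.
install_key() {
  pubkey="$1"
  sshdir="$2"
  mkdir -p "$sshdir"
  touch "$sshdir/authorized_keys"
  # -x: whole-line match, -F: fixed string, -f: read patterns from the key file.
  grep -qxFf "$pubkey" "$sshdir/authorized_keys" || cat "$pubkey" >> "$sshdir/authorized_keys"
  chmod 700 "$sshdir"
  chmod 600 "$sshdir/authorized_keys"
}

# Demo in a scratch directory with a placeholder (not a real) key line.
DEMO="$(mktemp -d)"
printf 'ssh-rsa AAAAPLACEHOLDER demo@example\n' > "$DEMO/id_rsa.pub"
install_key "$DEMO/id_rsa.pub" "$DEMO/.ssh"
install_key "$DEMO/id_rsa.pub" "$DEMO/.ssh"   # second call changes nothing
ls -ld "$DEMO/.ssh" "$DEMO/.ssh/authorized_keys"
```

On the real server the call would be something like install_key id_rsa.pub ~/.ssh, run as the remote user from step 6.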

Posted in Uncategorized

Dropbox: a must-have tool for sharing files

Posted by yangfeng on May 19, 2010

Register HERE:

https://www.dropbox.com/referrals/NTI5MDY3MDI5

File Sync

Dropbox allows you to sync your files online and across your computers automatically.

  • 2GB of online storage for free, with up to 100GB available to paying customers.
  • Sync files of any size or type.
  • Sync Windows, Mac and Linux computers.
  • Automatically syncs when new files or changes are detected.
  • Work on files in your Dropbox even if you’re offline. Your changes sync once your computer has an Internet connection again.
  • Dropbox transfers will correctly resume where they left off if the connection drops.
  • Efficient sync – only the pieces of a file that changed (not the whole file) are synced. This saves you time.
  • Doesn’t hog your Internet connection. You can manually set bandwidth limits.

File Sharing

Sharing files is simple and can be done with only a few clicks.

  • Shared folders allow several people to collaborate on a set of files.
  • You can see other people’s changes instantly.
  • A “Public” folder that lets you link directly to files in your Dropbox.
  • Control who is able to access shared folders (including ability to kick people out and remove the shared files from their computers).
  • Automatically create shareable online photo galleries from folders of photos in your Dropbox.

Online Backup

Dropbox backs up your files online without you having to think about it.

  • Automatic backup of your files.
  • Undelete files and folders.
  • Restore previous versions of your files.
  • 30 days of undo history, with unlimited undo available as a paid option.

Web Access

A copy of your files is stored on Dropbox’s secure servers. This lets you access them from any computer or mobile device.

  • Manipulate files as you would on your desktop – add, edit, delete, rename etc.
  • Search your entire Dropbox for files.
  • A “Recent Events” feed that shows you a summary of activity in your Dropbox.
  • Create shared folders and invite people to them.
  • Recover previous versions of any file or undelete deleted files.
  • View photo galleries created automatically from photos in your Dropbox.

Security & Privacy

Dropbox takes the security and privacy of your files very seriously.

  • Shared folders are viewable only by people you invite.
  • All transmission of file data and metadata occurs over an encrypted channel (SSL).
  • All files stored on Dropbox servers are encrypted (AES-256) and are inaccessible without your account password.
  • Dropbox website and client software have been hardened against attacks from hackers.
  • Dropbox employees are not able to view any user’s files.
  • Online access to your files requires your username and password.
  • Public files are only viewable by people who have a link to the file(s). Public folders are not browsable or searchable.

Mobile Device Access

The free Dropbox application for iPhone, iPad, and Android lets you:

  • Access your Dropbox on the go.
  • View files from within the application.
  • Download files for offline viewing.
  • Take photos and videos and sync them to your Dropbox.
  • Share links to files in your Dropbox.
  • Export your files to other applications.
  • Sync downloaded files so they’re up-to-date.

A mobile-optimized version of the website is also available for owners of Blackberry phones and other Internet-capable mobile devices.

Posted in Uncategorized

Abbreviations of R Commands

Posted by yangfeng on May 19, 2010

Abbreviations of R Commands Explained: 250+ R Abbreviations (From http://jeromyanglim.blogspot.com/2010/05/abbreviations-of-r-commands-explained.html)
The R programming language includes many abbreviations. Abbreviations exist in function names, argument names, and allowed values for arguments. This post expands on over 250 R abbreviations with the aim of making it easier for users new to R who are trying to memorise R commands.
Context
Abbreviations save time when typing and can make for less cumbersome code. However, abbreviations often make it more difficult to remember a command. This is especially true when the user does not know what the abbreviation stands for.

R has been developed by a group of technical experts with backgrounds in Linux and Unix, mathematics, statistics, and statistical computing. With gaining popularity, R is now being used by people with little to none of this background. Abbreviations which are intuitive to the experts are not necessarily intuitive to this broader audience. The R help system does a reasonable job of explaining the abbreviations in R. However, I thought it would be useful to write a post listing some of the common abbreviations along with the expansion of the abbreviation. Whereas R sometimes errs on the side of assuming expertise, I thought I’d err on the side of assuming naivety. Thus, the table includes many abbreviations which are probably obvious to most readers.
Table of R Commands
R Command Abbreviation Expanded Comments
ls [L]i[S]t objects common command in Unix-like operating systems
rm [R]e[M]ove objects common command in Unix-like operating systems
str [STR]ucture of an object
unz [UNZ]ip
getwd [GET] [W]orking [D]irectory
dir [DIR]ectory
sprintf [S]tring [PRINT] [F]ormatted
c [C]ombine values
regexpr [REG]ular [EXPR]ession Why “regular”? See regular sets, regular language
diag [DIAG]onal values of a matrix
col [COL]umn
lapply [L]ist [APPLY] Apply function to each element and return a list
sapply [S]implify [APPLY] Apply function to each element and attempt to return a vector (i.e., a vector is “simpler” than a list)
mapply [M]ultivariate [APPLY] Multivariate version of sapply
tapply [T]able [APPLY] Apply function to sets of values as defined by an index
apply [APPLY] function to sets of values as defined by an index
MARGIN = 1 or 2 in apply rows [1] come before columns [2] e.g., a 2 x 3 matrix has 2 rows and 3 columns (note: row count is stated first)
rmvnorm [R]andom number generator for [M]ulti[V]ariate [NORM]al data
rle [R]un [L]ength [E]ncoding
ftable [F]lat [TABLE]
xtabs Cross (i.e., [X]) [TAB]ulation [X] is the symbol of a cross; [X] is sometimes spoken as “by”. Cross-tabulating means to cross one variable with another
xtable [TABLE] of the object [X]
formatC [FORMAT] using [C] style formats i.e., [C] the programming language
Sweave [S] [WEAVE] The R Programming language is a dialect of S. Weaving involves combining code and documentation
cor [COR]relation
ancova [AN]alysis [O]f [COVA]riance
manova [M]ultivariate [AN]alysis [O]f [VA]riance
aov [A]nalysis [O]f [V]ariance
TukeyHSD [T]ukey’s [H]onestly [S]ignificant [D]ifference
hclust [H]ierarchical [CLUST]er analysis
cmdscale [C]lassical metric [M]ulti[D]imensional [SCAL]ing
factanal [FACT]or [ANAL]ysis
princomp [PRIN]cipal [COMP]onents analysis
prcomp [PR]incipal [COMP]onents analysis
lme [L]inear [M]ixed [E]ffects model
resid [RESID]uals
ranef [RAN]dom [EF]fects
anova [AN]alysis [O]f [VA]riance
fixef [FIX]ed [EF]fects
vcov [V]ariance-[COV]ariance matrix
logLik [LOG] [LIK]elihood
BIC [B]ayesian [I]nformation [C]riterion
mcmcsamp [M]arkov [C]hain [M]onte [C]arlo [SAMP]ling
eval [EVAL]uate an R expression
cat con[CAT]enate standard Unix command
apropos Search documentation for a purpose or on a topic (i.e., [APROPOS]) Unix command for search documentation;
read.csv [READ] a file in [C]omma [S]eparated [V]alues format i.e., in each row of the data commas separate values for each variable
read.fwf [READ] a file in [F]ixed [W]idth [F]ormat
seq Generate [SEQ]uence
rep [REP]licate values of x perhaps also [REP]eat
dim [DIM]ension of an object Typically, number of rows and columns in a matrix
gl [G]enerate factor [L]evels
rbind [R]ows [BIND]
cbind [C]olumns [BIND]
is.na [IS] [N]ot [A]vailable
nrow [N]umber of [ROW]s
ncol [N]umber of [COL]umns
attr [ATTR]ibute
rev [REV]erse
diff [DIFF]erence between x and a lag of x
prod [PROD]uct
var [VAR]iance
sd [S]tandard [D]eviation
cumsum [CUM]ulative [SUM]
cumprod [CUM]ulative [PROD]uct
setdiff [SET] [DIFF]erence
intersect [INTERSECT]ion
Re [RE]al part of a number
Im [IM]aginary part of a number
Mod [MOD]ulus of a complex number i.e., its absolute value; the remainder (modulo) operator is %%
t [T]ranspose of a vector or matrix
substr [SUBSTR]ing
strsplit [STR]ing [SPLIT]
grep [G]lobal / [R]egular [E]xpression / [P]rint Etymology based on text editor instructions in programs such as ed
sub [SUB]stitute identified pattern found in string
gsub [G]lobal [SUB]stitute identified pattern found in string
pmatch [P]artial string [MATCH]ing
nchar [N]umber of [CHAR]acters in a string
ps.options [P]ost-[S]cript [OPTIONS]
win.metafile [WIN]dows [METAFILE] graphic
dev.off [DEV]ice [OFF]
dev.cur [CUR]rent [DEV]ice
dev.set [SET] the current [DEV]ice
hist [HIST]ogram
pie [PIE] Chart
coplot [CO]nditioning [PLOT]
matplot [PLOT] columns of [MAT]rices
assocplot [ASSOC]iation [PLOT]
plot.ts [PLOT] [T]ime [S]eries
qqnorm [Q]uantile-[Q]uantile plot based on [NORM]al distribution
persp [PERSP]ective plot
xlim [LIM]it of the [X] axis
ylim [LIM]it of the [Y] axis
xlab [LAB]el for the [X] axis
ylab [LAB]el for the [Y] axis
main [MAIN] title for the plot
sub [SUB] title for the plot
mtext [M]argin [TEXT]
abline [LINE] on plot often of the form y = [A] + [B] x
h argument in abline [H]orizontal line
v argument in abline [V]ertical line
par Graphics [PAR]ameter
adj as par [ADJ]ust text [J]ustification
bg as par [B]ack[G]round colour
bty as par [B]ox [TY]pe
cex as par [C]haracter [EX]tension or [EX]pansion of plotting objects
cex.sub as par [C]haracter [EX]tension or [EX]pansion of [SUB]title
cex.axis as par [C]haracter [EX]tension or [EX]pansion of [AXIS] annotation
cex.lab as par [C]haracter [EX]tension or [EX]pansion X and Y [LAB]els
cex.main as par [C]haracter [EX]tension or [EX]pansion of [MAIN] title
col as par Default plotting [COL]our
las as par [L]abel of [A]xis [S]tyle
lty as par [L]ine [TY]pe
lwd as par [L]ine [W]i[D]th
mar as par [MAR]gin width in lines
mfg as par Next [G]raph for [M]atrix of [F]igures
mfcol as par [M]atrix of [F]igures entered [COL]umn-wise
mfrow as par [M]atrix of [F]igures entered [ROW]-wise
pch as par [P]lotting [CH]aracter
ps as par [P]oint [S]ize of text Point is a printing measurement
pty as par [P]lot region [TY]pe
tck as par [T]i[CK] mark length
tcl as par [T]i[C]k mark [L]ength
xaxs as par [X] [AX]is [S]tyle
yaxs as par [Y] [AX]is [S]tyle
xaxt as par [X] [AX]is [T]ype
yaxt as par [Y] [AX]is [T]ype
asp as par [ASP]ect ratio
xlog as par [X] axis as [LOG]arithm scale
ylog as par [Y] axis as [LOG]arithm scale
omi as par [O]uter [M]argin width in [I]nches
mai as par [MA]rgin width in [I]nches
pin as par [P]lot size in [IN]ches
xpd as par Perhaps: [X = Cut] [P]lot ? Perhaps D for device
xyplot [X] [Y] [PLOT] [X] for horizontal axis; [Y] for vertical axis
bwplot [B]ox and [W]hisker plot
qq [Q]uantile-[Q]uantile plot
splom [S]catter[PLO]t [M]atrix
optim [OPTIM]isation
lm [L]inear [M]odel
glm [G]eneralised [L]inear [M]odel
nls [N]onlinear [L]east [S]quares parameter estimation
loess [LO]cally [E]stimated [S]catterplot [S]moothing
prop.test [TEST] null hypothesis that [PROP]ortions in several groups are the same
rnorm [R]andom number drawn from [NORM]al distribution
dnorm [D]ensity of a given quantile in a [NORM]al distribution
pnorm Distribution function for [NORM]al distribution returning cumulative [P]robability
qnorm [Q]uantile function based on [NORM]al distribution
rexp [R]andom number generation from [EXP]onential distribution
rgamma [R]andom number generation from [GAMMA] distribution
rpois [R]andom number generation from [POIS]son distribution
rweibull [R]andom number generation from [WEIBULL] distribution
rcauchy [R]andom number generation from [CAUCHY] distribution
rbeta [R]andom number generation from [BETA] distribution
rt [R]andom number generation from [t] distribution
rf [R]andom number generation from [F] distribution F for Ronald [F]isher
rchisq [R]andom number generation from [CHI] [SQ]uare distribution
rbinom [R]andom number generation from [BINOM]ial distribution
rgeom [R]andom number generation from [GEOM]etric distribution
rhyper [R]andom number generation from [HYPER]geometric distribution
rlogis [R]andom number generation from [LOGIS]tic distribution
rlnorm [R]andom number generation from [L]og [NORM]al distribution
rnbinom [R]andom number generation from [N]egative [BINOM]ial distribution
runif [R]andom number generation from [UNIF]orm distribution
rwilcox [R]andom number generation from [WILCOX]on distribution
ggplot in ggplot2 [G]rammar of [G]raphics [PLOT] See Leland Wilkinson (1999)
aes in ggplot2 [AES]thetic mapping
geom_ in ggplot2 [GEOM]etric object
stat_ in ggplot2 [STAT]istical summary
coord_ in ggplot2 [COORD]inate system
qplot in ggplot2 [Q]uick [PLOT]
x as argument [X] is common letter for unknown variable in math
FUN as argument [FUN]ction
pos as argument [POS]ition
lib.loc in library [LIB]rary folder [LOC]ation
sep as argument [SEP]arator character
comment.char in read.table [COMMENT] [CHAR]acter(s)
I [I]nhibit [I]nterpretation or [I]nsulate
T value [T]rue
F value [F]alse
na.rm as argument [N]ot [A]vailable [R]e[M]oved
fivenum [FIVE] [NUM]ber summary
IQR [I]nter [Q]uartile [R]ange
coef Model [COEF]ficients
dist [DIST]ance matrix
df as argument [D]egrees of [F]reedom
mad [M]edian [A]bsolute [D]eviation
sink Divert R output to a connection (i.e., like connecting a pipe to a [SINK])
eol in write.table [E]nd [O]f [L]ine character(s)
R as software [R]oss Ihaka and [R]obert Gentleman or [R] is letter before S
CRAN as word [C]omprehensive [R] [A]rchive [N]etwork As I understand it: Inspired by CTAN (Comprehensive TeX Archive Network); pronunciation of CRAN rhymes with CTAN (i.e., “See” ran as in Iran; “See tan”)
Sexpr [S] [EXPR]ession
ls.str Show [STR]ucture of [L]i[S]ted objects
browseEnv [BROWSE] [ENV]ironment
envir as argument [ENVIR]onment
q [Q]uit
cancor [CAN]onical [COR]relation
ave [AVE]rage
min [MIN]imum
max [MAX]imum
sqrt [SQ]uare [R]oo[T]
%o% [O]uter product
& & is ampersand meaning [AND]
| | often used to represent OR in computing (http://en.wikipedia.org/wiki/Logical_disjunction)
: sequence generator; also used in MATLAB
nlevels [N]umber of [LEVELS] in a factor
det [DET]erminant of a matrix
crossprod Matrix [CROSSPROD]uct
gls [G]eneralised [L]east [S]quares
dwtest in lmtest [D]urbin-[W]atson Test
sem in sem [S]tructural [E]quation [M]odel
betareg in betareg [BETA] [REG]ression
log Natural [LOG]arithm Default base is e consistent with most mathematics (http://en.wikipedia.org/wiki/Logarithm#Implicit_bases)
log10 [LOG]arithm base 10
fft [F]ast [F]ourier [T]ransform
exp [EXP]onential function i.e., e^x
df.residual [D]egrees of [F]reedom of the [RESIDUAL]
sin [SIN]e function
cos [COS]ine function
tan [TAN]gent function
asin [A]rc[SIN]e function
acos [A]rc[COS]ine function
atan [A]rc[TAN]gent function
deriv [DERIV]ative
chol [CHOL]eski decomposition
chol2inv [CHOL]eski [2=TO] [INV]erse
svd [S]ingular [V]alue [D]ecomposition
eigen [EIGEN]value or [EIGEN]vector
lower.tri [LOWER] [TRI]angle of a matrix
upper.tri [UPPER] [TRI]angle of a matrix
acf [A]uto [C]orrelation or [C]ovariance [F]unction
pacf [P]artial [A]uto [C]orrelation or [C]ovariance [F]unction
ccf [C]ross [C]orrelation or [C]ovariance [F]unction
Rattle as software [R] [A]nalytical [T]ool [T]o [L]earn [E]asily Perhaps, easy like a baby’s rattle
StatET as software Anyone know? Statistics Eclipse?
JGR as software [J]ava [G]UI for [R] pronounced “Jaguar” like the cat
ESS as software [E]macs [S]peaks [S]tatistics
Rcmdr package [R] [C]o[m]man[d]e[r] GUI
prettyNum [PRETTY] [NUM]ber
Inf value [Inf]inite
NaN value [N]ot [A] [N]umber
is.nan [IS] [N]ot [A] [N]umber
S3 R is a dialect of [S]; 3 is the version number
S4 R is a dialect of [S]; 4 is the version number
Rterm as program [R] [TERM]inal
R CMD as program I think: [R] [C]o[m]man[D] prompt
repos as option [REPOS]itory locations
bin folder [BIN]aries Common Unix folder for “essential command binaries”
etc folder [et cetera] Common Unix folder for “host-specific system-wide configuration files”
src folder [S]ou[R]ce [C]ode Common Unix folder
doc folder [DOC]umentation
RGUI program [R] [G]raphical [U]ser [I]nterface
.site file extension [SITE] specific file e.g., Rprofile.site
Hmisc package Frank [H]arrell’s package of [MISC]ellaneous functions
n in debug [N]ext step
c in debug [C]ontinue
Q in debug [Q]uit
MASS package [M]odern [A]pplied [S]tatistics with [S] Based on book of same name by Venables and Ripley
plyr package PL[Y=ie][R] Double play on words: (1) package manipulates data like pliers manipulate materials; (2) last letter is R as in the program
aaply input [A]rray output [A]rray using [PLY]r package
daply input [D]ata frame output [A]rray using [PLY]r package
laply input [L]ist output [A]rray using [PLY]r package
adply input [A]rray output [D]ata frame using [PLY]r package
alply input [A]rray output [L]ist using [PLY]r package
a_ply input [A]rray output Discarded (i.e., _ is blank) using [PLY]r package
RODBC package [R] [O]pen [D]ata[B]ase [C]onnectivity
psych package [PSYCH]ology related functions
zelig package “Zelig is named after a Woody Allen movie about a man who had the strange ability to become the physical and psychological reflection of anyone he met and thus to fit perfectly in any situation.” – http://gking.harvard.edu/zelig/
strucchange package [STRUC]tural [CHANGE]
relaimpo package [RELA]tive [IMPO]rtance
car package [C]ompanion to [A]pplied [R]egression Named after book by John Fox
OpenMx package [OPEN] Source [M]atri[X] algebra interpreter Need confirmation that [Mx] means matrix
df in write.foreign [D]ata [F]rame
GNU S word [G]NU’s [N]ot [U]nix [S]
R FAQ word R [F]requently [A]sked [Q]uestions
DVI format [D]e[V]ice [I]ndependent file format
devel word [DEVEL]opment as in code under development
GPL word [G]eneral [P]ublic [L]icense
utils package [UTIL]itie[S]
mle [M]aximum [L]ikelihood [E]stimation
rpart package [R]ecursive [PART]itioning
sna package [S]ocial [N]etwork [A]nalysis
ergm package [E]xponential [R]andom [G]raph [M]odels
rbugs package [R] interface to program [B]ayesian inference [Using] [G]ibbs [S]ampling

Posted in R Tips

Matlab input parameters using linux command

Posted by yangfeng on April 3, 2010

nohup matlab -nodisplay -nojvm -nosplash -r "randSeed=72;rho=0;s0=40;" < input.m > output.o &
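When the parameters change from run to run, the -r string can be assembled from shell variables. The sketch below only prints the resulting command so the quoting can be checked before anything is launched; the shell variable names are my own, while randSeed, rho and s0 are the Matlab variables from the command above.

```shell
# Hypothetical shell variables holding the values to pass to Matlab.
RAND_SEED=72
RHO=0
S0=40

# Matlab statements executed before input.m is read from standard input.
MATLAB_INIT="randSeed=${RAND_SEED};rho=${RHO};s0=${S0};"

# Dry run: print the command instead of executing it.
printf 'nohup matlab -nodisplay -nojvm -nosplash -r "%s" < input.m > output.o &\n' "$MATLAB_INIT"
```

Once the printed line looks right, it can be pasted into the terminal or run inside a loop over seeds.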

Posted in Matlab

EPS figures and PDF conversion (Credit to Gary)

Posted by yangfeng on March 28, 2010

Original Post: http://electron.mit.edu/~gsteele/pdf/

Background

When submitting papers to journals or online archives, one of the common submission methods is to write your paper in LaTeX and submit the figures as EPS files.

LaTeX was originally designed to produce a “Device-Independent” .dvi file that could then be converted to a postscript file for printing using another program, such as dvips. In the old days, the standard way to exchange a paper was to send it as postscript (or gzipped postscript), as this contained the actual layout with embedded figures.

Recently, postscript has been eclipsed by the PDF standard, which has many more features built in, such as text compression and image compression. In order to convert a LaTeX .dvi file to PDF, many useful tools are available. In particular, I find that dvipdfm is the most effective, as it automatically includes the fonts as vector outlines instead of bitmaps, a problem that plagued the dvips/ps2pdf conversion option. It also supports many LaTeX extensions that take advantage of PDF’s advanced features, such as hyperlinking and embedding movies.

Natively, PDF supports many different options for compressing images in order to reduce file size. Two of the common ones are:

  • DCTEncode: This is a lossy compression technique similar to that used in JPEG files. A Discrete-Cosine transformation of the data is taken and small coefficients are discarded. The number of coefficients to discard, and hence the file size and image quality, are controlled by the quality (or “quantization”) factor. While this produces small file sizes, it also can significantly degrade the image quality if the “quality factor” is too low. Usually, this results in “ringing” around sharp lines: it is particularly obvious around black lines or letters, where it produces fuzziness and “speckles”.

    This setting produces “lossy” compression, and may be necessary for high resolution photographic images, where it produces acceptable quality. It is particularly poorly suited for things like graphs that include only solid lines on a white background.

  • FlateEncode: This is a non-lossy ZLIB compression that is similar to “zipping” the image using software like gzip. It is equivalent to the compression that is used in PNG images. This is ideal for figures that include a small number of colors, such as plots with colored lines and symbols, as it will compress well and exactly reproduces the image you started with.

To show what the difference between these two looks like, compare the following two images:

Lossless (FlateEncode/Zlib) Compression (470 bytes):

Lossy (DCTEncode/jpeg) compression with low quality factor (919 bytes):

The DCT compressed image is fuzzier, and shows “ringing” noise around the edges of the text. Note also that for images such as these that have only a few colors, DCTEncoding actually produces a larger file than FlateEncoding.

For more details about these and other image compression settings included in the PDF standard, I would recommend reading the “Using the Image Settings” section of the Adobe PDF Creation Settings manual available from Adobe’s Acrobat SDK Documentation developer’s resource page. The Adobe Distiller Parameters manual is also quite useful.

One of the problems with pdf conversion is that most pdf converters (“distillers”) are configured by default to always use DCTEncoding for color and grayscale images. For a scientific paper, this produces very poor results.

Note that even such reputed publishers as Science and Nature suffer from these problems: if you zoom in on a figure from an online PDF from one of these publishers, the figures clearly show the signs of DCT compression with a low quality factor.

Rasterizing Figures

This all becomes even more relevant when submitting figures to an online archive server, such as the arXiv preprint server. The arXiv server limits the size of submissions to 1MB, which can make including high quality figures difficult. In particular, postscript figures of plots with a lot of points or lines can easily vastly exceed this limit.

The best solution is to take your vector postscript figures and “rasterize” them at a fixed resolution by converting them to either PNG image files or JPEG image files with a high quality setting.

A good way to do this is using ghostscript:

$ gs -r300 -dEPSCrop -dTextAlphaBits=4 -sDEVICE=png16m -sOutputFile=fig.png -dBATCH -dNOPAUSE fig.eps

You can set the image resolution in pixels per inch using the -r flag. Make sure to include the -dEPSCrop option to crop the output to the size of the bounding box. The -dTextAlphaBits=4 option will anti-alias fonts in the EPS file so they have smooth looking edges. In general, printers are capable of at least 300 dpi, although I find you can go down to 150 dpi before it becomes really noticeable to the eye. Changing the resolution has by far the biggest impact on the file size.
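To try several resolutions quickly, the ghostscript call can be wrapped in a small shell function. This is a sketch: the function name rasterize_eps and the DRY_RUN switch are my own additions, while the gs options are exactly the ones from the command above. With DRY_RUN set, the command is only printed.

```shell
# rasterize_eps FILE.eps [DPI]
# Converts FILE.eps to FILE.png at the given resolution (default 300 dpi).
rasterize_eps() {
  src="$1"
  dpi="${2:-300}"
  out="${src%.eps}.png"
  # Build the command as the function's positional parameters.
  set -- gs -r"$dpi" -dEPSCrop -dTextAlphaBits=4 -sDEVICE=png16m \
         -sOutputFile="$out" -dBATCH -dNOPAUSE "$src"
  if [ -n "$DRY_RUN" ]; then
    echo "$@"      # show what would be run
  else
    "$@"           # actually run ghostscript
  fi
}

# Dry run at 150 dpi: prints the gs command without executing it.
DRY_RUN=1 rasterize_eps fig.eps 150
```

Without DRY_RUN, rasterize_eps fig.eps 150 writes fig.png next to fig.eps, so the file sizes at 150 and 300 dpi can be compared directly.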

Once the figures are rasterized, the raster image can be encapsulated into an EPS file using programs like ImageMagick or imgtops. The imgtops webpage has an excellent discussion of the subtleties of this step. ImageMagick is included in Cygwin, making it easy to use on a Windows computer.

An important consideration is what postscript compatibility level you can use. As discussed in the imgtops page linked above, newer postscript versions support much better internal image formats. Level 1 uses only ascii-coded RGB values, and is very wasteful, producing very large files. Level 2 includes support for JPEG encoded images, which produces much smaller files. Level 3 includes support for Zlib compression, making it well suited for making EPS files from png files.

In general, level 3 will produce the smallest files. Level 2 provides the best compatibility, and works well with jpeg images.

If you decide to use level 2 postscript, I recommend converting first to a jpg file. The “convert” program included with ImageMagick uses a quality factor in “percent” that ranges from 0 to 100:

$ convert -quality 80 fig.png fig.jpg

I find a quality factor of 80 on high resolution images gives good compression without too much loss in quality. You can then convert the image to eps using “convert” with the eps2 settings:

$ convert fig.jpg eps2:fig.eps

If you can use level 3 postscript, you can convert directly from png to eps:

$ convert fig.png eps3:fig.eps

Using level 3 postscript from a png image file for scientific figures will often produce a very small eps file. Ghostscript is compatible with these level 3 eps files, so this is often a good way to go.

By adjusting a combination of the jpeg quality factor and the image resolution at the rasterization step, you can tweak the images to get an EPS file exactly the size you need while maintaining the highest possible quality and level of detail.

More information about this rasterization process is available from the arXiv Bitmapping Figures page.

In general, I have found that it is easy to produce relatively high quality rasterized images this way that are small enough to squeeze inside the arXiv 1MB submission size limit.
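A quick way to see how close a set of figures is to the limit is to add up the sizes of the EPS files. The sketch below assumes that 1MB means 1024 × 1024 bytes and that only the .eps files in one folder count towards the limit; both are my assumptions, and the function name is hypothetical.

```shell
# eps_total_bytes [DIR]
# Prints the combined size in bytes of all .eps files in DIR (default: current folder).
eps_total_bytes() {
  dir="${1:-.}"
  total=0
  for f in "$dir"/*.eps; do
    [ -e "$f" ] || continue          # no .eps files: the glob stays unexpanded
    size=$(wc -c < "$f")
    total=$((total + size))
  done
  echo "$total"
}

# Assumed limit: 1MB taken as 1024 * 1024 bytes.
LIMIT_BYTES=$((1024 * 1024))
TOTAL=$(eps_total_bytes .)
if [ "$TOTAL" -le "$LIMIT_BYTES" ]; then
  echo "figures use $TOTAL of $LIMIT_BYTES bytes"
else
  echo "figures exceed the limit by $((TOTAL - LIMIT_BYTES)) bytes"
fi
```

The .tex source and any other submitted files also count, so the real budget for figures is somewhat smaller than this check suggests.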

However, this is not the end of the story…

PDF Conversion

This is where the real problem is. As I mentioned above, the default setting for most PDF distillers is to always compress color images using DCTEncoding, and the default quality factor is usually quite low.

This means that even though we have gone to all the trouble of tweaking our image resolution and quality factors to get the best quality images possible for our 1MB file limit, the images will be recompressed by the PDF conversion software when somebody downloads a pdf of our paper. Furthermore, this recompression at the PDF conversion stage will involve a low quality factor, and the figures that will be in the PDF file will be of remarkably poor quality.

Fortunately, there is a way around this. The Adobe specification also defines a special set of Postscript commands that can be used inside of a postscript file to control the settings that the postscript to pdf conversion software uses. By manually editing your EPS files to include these special postscript commands, you can tell the PDF distiller exactly which types of image compression to use. This will work for any distiller that is compatible with the Adobe specifications, which fortunately includes the PDF conversion abilities of ghostscript.

In order to get high quality figures in the converted PDF, you can either tell the PDF distiller to use FlateEncode or to use DCTEncode with a high quality factor. Here are the postscript snippets that allow you to do this:

  • Use DCTEncode with a high quality factor: this involves setting a parameter called “/QFactor” to a small number. The /QFactor parameter actually refers to “Quantization factor”. Setting this to 0.15 uses the same settings as “Maximum Quality” mode for Acrobat distiller.
    systemdict /setdistillerparams known {
    << /ColorACSImageDict << /QFactor 0.15 /Blend 1 /ColorTransform 1 /HSamples [1 1 1 1] /VSamples [1 1 1 1] >>
    >> setdistillerparams
    } if
    
  • Use FlateEncode: this way, the images that you see in the converted pdf file will be exactly identical to the EPS images you submit.
    systemdict /setdistillerparams known {
    << /AutoFilterColorImages false /ColorImageFilter /FlateEncode >> setdistillerparams
    } if
    

To use these, simply open up your .eps file in a text editor such as emacs and insert the text after the end of the “%” commented area at the beginning of the file. This should automatically work with dvipdfm conversion as well as the pdf conversion software used on the arXiv server.
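For more than a couple of figures, the manual edit can be scripted. The following sketch inserts the FlateEncode snippet just before the first line that does not start with “%”, which is how I read “after the end of the commented area”; the function name is hypothetical, the result goes to standard output so the original file is left untouched, and a file consisting only of comment lines is passed through unchanged.

```shell
# add_flate_hint FILE.eps
# Prints FILE.eps with the FlateEncode distiller snippet inserted after the
# leading block of "%" comment lines.
add_flate_hint() {
  awk '
    !inserted && !/^%/ {
      print "systemdict /setdistillerparams known {"
      print "<< /AutoFilterColorImages false /ColorImageFilter /FlateEncode >> setdistillerparams"
      print "} if"
      inserted = 1
    }
    { print }
  ' "$1"
}

# Usage (writes a new file and keeps the original):
# add_flate_hint fig.eps > fig_flate.eps
```

It is worth opening the new file in a viewer once to confirm that the figure still renders before submitting it.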

Posted in Latex Tips

 