Friday, January 7, 2011

A Visual Explanation of SQL Joins

SkyHi @ Friday, January 07, 2011
I thought Ligaya Turmelle's post on SQL joins was a great primer for novice developers. Since SQL joins appear to be set-based, the use of Venn diagrams to explain them seems, at first blush, to be a natural fit. However, like the commenters to her post, I found that the Venn diagrams didn't quite match the SQL join syntax reality in my testing.
I love the concept, though, so let's see if we can make it work. Assume we have the following two tables. Table A is on the left, and Table B is on the right. We'll populate them with four records each.

id name       id  name
-- ----       --  ----
1  Pirate     1   Rutabaga
2  Monkey     2   Pirate
3  Ninja      3   Darth Vader
4  Spaghetti  4   Ninja
Let's join these tables by the name field in a few different ways and see if we can get a conceptual match to those nifty Venn diagrams.

ON =

id  name       id   name
--  ----       --   ----
1   Pirate     2    Pirate
3   Ninja      4    Ninja
Inner join produces only the set of records that match in both Table A and Table B.
Venn diagram of SQL inner join
ON =

id    name       id    name
--    ----       --    ----
1     Pirate     2     Pirate
2     Monkey     null  null
3     Ninja      4     Ninja
4     Spaghetti  null  null
null  null       1     Rutabaga       
null  null       3     Darth Vader
Full outer join produces the set of all records in Table A and Table B, with matching records from both sides where available. If there is no match, the missing side will contain null.
Venn diagram of SQL cartesian join

ON =

id  name       id    name
--  ----       --    ----
1   Pirate     2     Pirate
2   Monkey     null  null
3   Ninja      4     Ninja
4   Spaghetti  null  null
Left outer join produces a complete set of records from Table A, with the matching records (where available) in Table B. If there is no match, the right side will contain null.
Venn diagram of SQL left join
ON =

id  name       id     name
--  ----       --     ----
2   Monkey     null   null
4   Spaghetti  null   null
To produce the set of records only in Table A, but not in Table B, we perform the same left outer join, then exclude the records we don't want from the right side via a where clause.
ON =
WHERE IS null 
OR IS null
id    name       id    name
--    ----       --    ----
2     Monkey     null  null
4     Spaghetti  null  null
null  null       1     Rutabaga
null  null       3     Darth Vader
To produce the set of records unique to Table A and Table B, we perform the same full outer join, then exclude the records we don't want from both sides via a where clause.
There's also a cartesian product or cross join, which as far as I can tell, can't be expressed as a Venn diagram:
This joins "everything to everything", resulting in 4 x 4 = 16 rows, far more than we had in the original sets. If you do the math, you can see why this is a very dangerous join to run against large tables.


Best of Vim Tips

SkyHi @ Friday, January 07, 2011
David Rayner (zzapper) 15 Years of Vi + 7 years of Vim and still learning 12Oct10 : Last Update

Vim Tips Blog (NEW)

Vim @ Squidoo





Buy Vim Book Support VIM

Submit to Social Websites  

" new items marked *N* , corrected items marked *C*
" searching
/joe/e                      : cursor set to End of match
3/joe/e+1                   : find 3rd joe cursor set to End of match plus 1 *C*
/joe/s-2                    : cursor set to Start of match minus 2
/joe/+3                     : find joe move cursor 3 lines down
/^joe.*fred.*bill/          : find joe AND fred AND Bill (Joe at start of line)
/^[A-J]/                    : search for lines beginning with one or more A-J
/begin\_.*end               : search over possible multiple lines
/fred\_s*joe/               : any whitespace including newline *C*
/fred\|joe                  : Search for FRED OR JOE
/.*fred\&.*joe              : Search for FRED AND JOE in any ORDER!
/\<fred\>/                  : search for fred but not alfred or frederick *C*
/\<\d\d\d\d\>               : Search for exactly 4 digit numbers
/\D\d\d\d\d\D               : Search for exactly 4 digit numbers
/\<\d\{4}\>                 : same thing
/\([^0-9]\|^\)%.*%          : Search for absence of a digit or beginning of line
" finding empty lines
/^\n\{3}                    : find 3 empty lines
/^str.*\nstr                : find 2 successive lines starting with str
/\(^str.*\n\)\{2}           : find 2 successive lines starting with str
" using rexexp memory in a search
" Repeating the Regexp (rather than what the Regexp finds)
" visual searching
:vmap // y/<C-R>"<CR>       : search for visually highlighted text
:vmap <silent> //    y/<C-R>=escape(@", '\\/.*$^~[]')<CR><CR> : with spec chars
" \zs and \ze regex delimiters :h /\zs
/<\zs[^>]*\ze>              : search for tag contents, ignoring chevrons
" zero-width :h /\@=
/<\@<=[^>]*>\@=             : search for tag contents, ignoring chevrons
/<\@<=\_[^>]*>\@=           : search for tags across possible multiple lines
" searching over multiple lines \_ means including newline
/<!--\_p\{-}-->                   : search for multiple line comments
/fred\_s*joe/                     : any whitespace including newline *C*
/bugs\(\_.\)*bunny                : bugs followed by bunny anywhere in file
:h \_                             : help
" search for declaration of subroutine/function under cursor
:nmap gx yiw/^\(sub\<bar>function\)\s\+<C-R>"<CR>
" multiple file search
:bufdo /searchstr/                : use :rewind to recommence search
" multiple file search better but cheating
:bufdo %s/searchstr/&/gic   : say n and then a to stop
" How to search for a URL without backslashing
?        : (first) search BACKWARDS!!! clever huh!
" Specify what you are NOT searching for (vowels)
/\c\v([^aeiou]&\a){4}       : search for 4 consecutive consonants
/\%>20l\%<30lgoat           : Search for goat between lines 20 and 30 *N*
/^.\{-}home.\{-}\zshome/e   : match only the 2nd occurence in a line of "home" *N*
:%s/home.\{-}\zshome/alone  : Substitute only the occurrence of home in any line *N*
" find str but not on lines containing tongue
:%s/fred/joe/igc            : general substitute command
:%s//joe/igc                : Substitute what you last searched for *N*
:%s/~/sue/igc               : Substitute your last replacement string *N*
:%s/\r//g                   : Delete DOS returns ^M
" Is your Text File jumbled onto one line? use following
:%s/\r/\r/g                 : Turn DOS returns ^M into real returns
:%s=  *$==                  : delete end of line blanks
:%s= \+$==                  : Same thing
:%s#\s*\r\?$##              : Clean both trailing spaces AND DOS returns
:%s#\s*\r*$##               : same thing
" deleting empty lines
:%s/^\n\{3}//               : delete blocks of 3 empty lines
:%s/^\n\+/\r/               : compressing empty lines
:%s#<[^>]\+>##g             : delete html tags, leave text (non-greedy)
:%s#<\_.\{-1,}>##g          : delete html tags possibly multi-line (non-greedy)
:%s#.*\(\d\+hours\).*#\1#   : Delete all but memorised string (\1) *N*
%s#><\([^/]\)#>\r<\1#g      : split jumbled up XML file into one tag per line *N*
" VIM Power Substitute
:'a,'bg/fred/s/dick/joe/igc : VERY USEFUL
" duplicating columns
:%s= [^ ]\+$=&&=            : duplicate end column
:%s= \f\+$=&&=              : same thing
:%s= \S\+$=&&               : usually the same
" memory
%s#.*\(tbl_\w\+\).*#\1#     : produce a list of all strings tbl_*   *N*
:s/\(.*\):\(.*\)/\2 : \1/   : reverse fields separated by :
:%s/^\(.*\)\n\1$/\1/        : delete duplicate lines
" non-greedy matching \{-}
:%s/^.\{-}pdf/new.pdf/      : delete to 1st occurence of pdf only (non-greedy)
" use of optional atom \?
:%s#\<[zy]\?tbl_[a-z_]\+\>#\L&#gc : lowercase with optional leading characters
" over possibly many lines
:%s/<!--\_.\{-}-->//        : delete possibly multi-line comments
:help /\{-}                 : help non-greedy
" substitute using a register
:s/fred/<c-r>a/g            : sub "fred" with contents of register "a"
:s/fred/\=@a/g              : better alternative as register not displayed
" multiple commands on one line
:%s/\f\+\.gif\>/\r&\r/g | v/\.gif$/d | %s/gif/jpg/
:%s/a/but/gie|:update|:next : then use @: to repeat
" ORing
:%s/goat\|cow/sheep/gc      : ORing (must break pipe)
:'a,'bs#\[\|\]##g           : remove [] from lines between markers a and b *N*
:%s/\v(.*\n){5}/&\r         : insert a blank line every 5 lines *N*
" Calling a VIM function
:s/__date__/\=strftime("%c")/ : insert datestring
" Working with Columns sub any str1 in col3
" Swapping first & last column (4 columns)
" format a mysql query 
:%s#\<from\>\|\<where\>\|\<left join\>\|\<\inner join\>#\r&#g
" filter all form elements into paste register
:redir @*|sil exec 'g#<\(input\|select\|textarea\|/\=form\)\>#p'|redir END
:nmap ,z :redir @*<Bar>sil exec 'g@<\(input\<Bar>select\<Bar>textarea\<Bar>/\=form\)\>@p'<Bar>redir END<CR>
" substitute string in column 30 *N*
" decrement numbers by 3
" increment numbers by 6 on certain lines only
" better
:h /\zs
" increment only numbers gg\d\d  by 6 (another way)
:h zero-width
" rename a string with an incrementing number
:let i=10 | 'a,'bg/Abc/s/yy/\=i/ |let i=i+1 # convert yy to 10,11,12 etc
" as above but more precise
:let i=10 | 'a,'bg/Abc/s/xx\zsyy\ze/\=i/ |let i=i+1 # convert xxyy to xx11,xx12,xx13
" find replacement text, put in memory, then use \zs to simplify substitute
" Pull word under cursor into LHS of a substitute
:nmap <leader>z :%s#\<<c-r>=expand("<cword>")<cr>\>#
" Pull Visually Highlighted text into LHS of a substitute
:vmap <leader>z :<C-U>%s/\<<c-r>*\>/
" substitute singular or plural
:'a,'bs/bucket\(s\)*/bowl\1/gic   *N*
" all following performing similar task, substitute within substitution
" Multiple single character substitution in a portion of line only
:%s,\(all/.*\)\@<=/,_,g     : replace all / with _ AFTER "all/"
" Same thing
:s#all/\zs.*#\=substitute(submatch(0), '/', '_', 'g')#
" Substitute by splitting line, then re-joining
" Substitute inside substitute
:%s/.*/\='cp '.submatch(0).' all/'.substitute(submatch(0),'/','_','g')/
" global command display 
:g/gladiolli/#              : display with line numbers (YOU WANT THIS!)
:g/fred.*joe.*dick/         : display all lines fred,joe & dick
:g/\<fred\>/                : display all lines fred but not freddy
:g/^\s*$/d                  : delete all blank lines
:g!/^dd/d                   : delete lines not containing string
:v/^dd/d                    : delete lines not containing string
:g/joe/,/fred/d             : not line based (very powerfull)
:g/fred/,/joe/j             : Join Lines *N*
:g/-------/.-10,.d          : Delete string & 10 previous lines
:g/{/ ,/}/- s/\n\+/\r/g     : Delete empty lines but only between {...}
:v/\S/d                     : Delete empty lines (and blank lines ie whitespace)
:v/./,/./-j                 : compress empty lines
:g/^$/,/./-j                : compress empty lines
:g/<input\|<form/p          : ORing
:g/^/put_                   : double space file (pu = put)
:g/^/m0                     : Reverse file (m = move)
:g/^/m$                     : No effect! *N*
:'a,'bg/^/m'b               : Reverse a section a to b
:g/^/t.                     : duplicate every line
:g/fred/t$                  : copy(transfer) lines matching fred to EOF
:g/stage/t'a                : copy (transfer) lines matching stage to marker a (cannot use .) *C*
:g/^Chapter/t.|s/./-/g      : Automatically underline selecting headings *N*
:g/\(^I[^^I]*\)\{80}/d      : delete all lines containing at least 80 tabs
" perform a substitute on every other line
:g/^/ if line('.')%2|s/^/zz / 
" match all lines containing "somestr" between markers a & b
" copy after line containing "otherstr"
:'a,'bg/somestr/co/otherstr/ : co(py) or mo(ve)
" as above but also do a substitution
:%norm jdd                  : delete every other line
" incrementing numbers (type <c-a> as 5 characters)
:.,$g/^\d/exe "norm! \<c-a>": increment numbers
:'a,'bg/\d\+/norm! ^A       : increment numbers
" storing glob results (note must use APPEND) you need to empty reg a first with qaq. 
"save results to a register/paste buffer
:g/fred/y A                 : append all lines fred to register a
:g/fred/y A | :let @*=@a    : put into paste buffer
:let @a=''|g/Barratt/y A |:let @*=@a
" filter lines to a file (file must already exist)
:'a,'bg/^Error/ . w >> errors.txt
" duplicate every line in a file wrap a print '' around each duplicate
:g/./yank|put|-1s/'/"/g|s/.*/Print '&'/
" replace string with contents of a file, -d deletes the "mark"
:g/^MARK$/r tmp.txt | -d
" display prettily
:g/<pattern>/z#.5           : display with context
:g/<pattern>/z#.5|echo "=========="  : display beautifully
" Combining g// with normal mode commands
:g/|/norm 2f|r*                      : replace 2nd | with a star
"send output of previous global command to a new window
:nmap <F3>  :redir @a<CR>:g//<CR>:redir END<CR>:new<CR>:put! a<CR><CR>
" Global combined with substitute (power editing)
:'a,'bg/fred/s/joe/susan/gic :  can use memory to extend matching
:/fred/,/joe/s/fred/joe/gic :  non-line based (ultra)
:/biz/,/any/g/article/s/wheel/bucket/gic:  non-line based *N*
" Find fred before beginning search for joe
" create a new file for each line of file eg 1.txt,2.txt,3,txt etc
:g/^/exe ".w ".line(".").".txt"
" chain an external command
:.g/^/ exe ".!sed 's/N/X/'" | s/I/Q/    *N*
" Operate until string found *N*
d/fred/                                :delete until fred
y/fred/                                :yank until fred
c/fred/e                               :change until fred end
" Summary of editing repeats *N*
.      last edit (magic dot)
:&     last substitute
:%&    last substitute every line
:%&gic last substitute every line confirm
g%     normal mode repeat last substitute
g&     last substitute on all lines
@@     last recording
@:     last command-mode command
:!!    last :! command
:~     last substitute
:help repeating
" Summary of repeated searches
;      last f, t, F or T
,      last f, t, F or T in opposite direction
n      last / or ? search
N      last / or ? search in opposite direction
" Absolutely essential
* # g* g#           : find word under cursor (<cword>) (forwards/backwards)
%                   : match brackets {}[]()
.                   : repeat last modification 
@:                  : repeat last : command (then @@)
matchit.vim         : % now matches tags <tr><td><script> <?php etc
<C-N><C-P>          : word completion in insert mode
<C-X><C-L>          : Line complete SUPER USEFUL
/<C-R><C-W>         : Pull <cword> onto search/command line
/<C-R><C-A>         : Pull <CWORD> onto search/command line
:set ignorecase     : you nearly always want this
:set smartcase      : overrides ignorecase if uppercase used in search string (cool)
:syntax on          : colour syntax in Perl,HTML,PHP etc
:set syntax=perl    : force syntax (usually taken from file extension)
:h regexp<C-D>      : type control-D and get a list all help topics containing
                      regexp (plus use TAB to Step thru list)
:nmap ,s :source $VIM/_vimrc
:nmap ,v :e $VIM/_vimrc
:e $MYVIMRC         : edits your _vimrc whereever it might be  *N*
" How to have a variant in your .vimrc for different PCs *N*
ab mypc vista
ab mypc dell25
" splitting windows
:vsplit other.php       # vertically split current file with other.php *N*
"VISUAL MODE (easy to add other HTML Tags)
:vmap sb "zdi<b><C-R>z</b><ESC>  : wrap <b></b> around VISUALLY selected Text
:vmap st "zdi<?= <C-R>z ?><ESC>  : wrap <?=   ?> around VISUALLY selected Text
"vim 7 tabs
vim -p fred.php joe.php             : open files in tabs
:tabe fred.php                      : open fred.php in a new tab
:tab ball                           : tab open files
" vim 7 forcing use of tabs from .vimrc
:nnoremap gf <C-W>gf
:cab      e  tabe
:tab sball                           : retab all files in buffer (repair) *N*
" Exploring
:e .                            : file explorer
:Exp(lore)                      : file explorer note capital Ex
:Sex(plore)                     : file explorer in split window
:browse e                       : windows style browser
:ls                             : list of buffers
:cd ..                          : move to parent directory
:args                           : list of files
:args *.php                     : open list of files (you need this!)
:lcd %:p:h                      : change to directory of current file
:autocmd BufEnter * lcd %:p:h   : change to directory of current file automatically (put in _vimrc)
" Changing Case
guu                             : lowercase line
gUU                             : uppercase line
Vu                              : lowercase line
VU                              : uppercase line
g~~                             : flip case line
vEU                             : Upper Case Word
vE~                             : Flip Case Word
ggguG                           : lowercase entire file
" Titlise Visually Selected Text (map for .vimrc)
vmap ,c :s/\<\(.\)\(\k*\)\>/\u\1\L\2/g<CR>
" titlise a line
nmap ,t :s/.*/\L&/<bar>:s/\<./\u&/g<cr>  *N*
" Uppercase first letter of sentences
gf                              : open file name under cursor (SUPER)
:nnoremap gF :view <cfile><cr>  : open file under cursor, create if necessary
ga                              : display hex,ascii value of char under cursor
ggVGg?                          : rot13 whole file
ggg?G                           : rot13 whole file (quicker for large file)
:8 | normal VGg?                : rot13 from line 8
:normal 10GVGg?                 : rot13 from line 8
<C-A>,<C-X>                     : increment,decrement number under cursor
                                  win32 users must remap CNTRL-A
<C-R>=5*5                       : insert 25 into text (mini-calculator)
" Make all other tips superfluous
:h 42            : also
:h holy-grail
" disguise text (watch out) *N*
ggVGg?                          : rot13 whole file (toggles)
:set rl!                        : reverse lines right to left (toggles)
:g/^/m0                         : reverse lines top to bottom (toggles)
" Markers & moving about
'.               : jump to last modification line (SUPER)
`.               : jump to exact spot in last modification line
g;               : cycle thru recent changes (oldest first)
g,               : reverse direction 
:h changelist    : help for above
<C-O>            : retrace your movements in file (starting from most recent)
<C-I>            : retrace your movements in file (reverse direction)
:ju(mps)         : list of your movements
:help jump-motions
:history         : list of all your commands
:his c           : commandline history
:his s           : search history
q/               : Search history Window (puts you in full edit mode) (exit CTRL-C)
q:               : commandline history Window (puts you in full edit mode) (exit CTRL-C)
:<C-F>           : history Window (exit CTRL-C)
" Abbreviations & Maps
" Following 4 maps enable text transfer between VIM sessions
:map   <f7>   :'a,'bw! c:/aaa/x       : save text to file x
:map   <f8>   :r c:/aaa/x             : retrieve text 
:map   <f11>  :.w! c:/aaa/xr<CR>      : store current line
:map   <f12>  :r c:/aaa/xr<CR>        : retrieve current line
:ab php          : list of abbreviations beginning php
:map ,           : list of maps beginning ,
" allow use of F10 for mapping (win32)
set wak=no       : :h winaltkeys
" For use in Maps
<CR>             : carriage Return for maps
<ESC>            : Escape
<LEADER>         : normally \
<BAR>            : | pipe
<BACKSPACE>      : backspace
<SILENT>         : No hanging shell window
#display RGB colour under the cursor eg #445588
:nmap <leader>c :hi Normal guibg=#<c-r>=expand("<cword>")<cr><cr>
map <f2> /price only\\|versus/ :in a map need to backslash the \
# type table,,, to get <table></table>       ### Cool ###
imap ,,, <esc>bdwa<<esc>pa><cr></<esc>pa><esc>kA
" Simple PHP debugging display all variables yanked into register a
iab phpdb exit("<hr>Debug <C-R>a  ");
" Using a register as a map (preload registers in .vimrc)
:let @m=":'a,'bs/"
:let @s=":%!sort -u"
" Useful tricks
"ayy@a           : execute "Vim command" in a text file
yy@"             : same thing using unnamed register
u@.              : execute command JUST typed in
"ddw             : store what you delete in register d *N*
"ccaw            : store what you change in register c *N*
" Get output from other commands (requires external programs)
:r!ls -R         : reads in output of ls
:put=glob('**')  : same as above                 *N*
:r !grep "^ebay" file.txt  : grepping in content   *N*
:20,25 !rot13    : rot13 lines 20 to 25   *N*
!!date           : same thing (but replaces/filters current line)
" Sorting with external sort
:%!sort -u       : use an external program to filter content
:'a,'b!sort -u   : use an external program to filter content
!1} sort -u      : sorts paragraph (note normal mode!!)
:g/^$/;,/^$/-1!sort : Sort each block (note the crucial ;)
" Sorting with internal sort
:sort /.*\%2v/   : sort all lines on second column *N*
" number lines 
:new | r!nl #                  *N*
" Multiple Files Management (Essential)
:bn              : goto next buffer
:bp              : goto previous buffer
:wn              : save file and move to next (super)
:wp              : save file and move to previous
:bd              : remove file from buffer list (super)
:bun             : Buffer unload (remove window but not from list)
:badd file.c     : file from buffer list
:b3              : go to buffer 3 *C*
:b main          : go to buffer with main in name eg main.c (ultra)
:sav php.html    : Save current file as php.html and "move" to php.html
:sav! %<.bak     : Save Current file to alternative extension (old way)
:sav! %:r.cfm    : Save Current file to alternative extension
:sav %:s/fred/joe/           : do a substitute on file name
:sav %:s/fred/joe/:r.bak2    : do a substitute on file name & ext.
:!mv % %:r.bak   : rename current file (DOS use Rename or DEL)
:help filename-modifiers
:e!              : return to unmodified file
:w c:/aaa/%      : save file elsewhere
:e #             : edit alternative file (also cntrl-^)
:rew             : return to beginning of edited files list (:args)
:brew            : buffer rewind
:sp fred.txt     : open fred.txt into a split
:sball,:sb       : Split all buffers (super)
:scrollbind      : in each split window
:map   <F5> :ls<CR>:e # : Pressing F5 lists all buffer, just type number
:set hidden      : Allows to change buffer w/o saving current buffer
" Quick jumping between splits
map <C-J> <C-W>j<C-W>_
map <C-K> <C-W>k<C-W>_
" Recording (BEST TIP of ALL)
qq  # record to q
your complex series of commands
q   # end recording
@q to execute
@@ to Repeat
5@@ to Repeat 5 times
qQ@qq                             : Make an existing recording q recursive *N*
" editing a register/recording
"qp                               :display contents of register q (normal mode)
<ctrl-R>q                         :display contents of register q (insert mode)
" you can now see recording contents, edit as required
"qdd                              :put changed contacts back into q
@q                                :execute recording/register q
" Operating a Recording on a Visual BLOCK
1) define recording/register
qq:s/ to/ from/g^Mq
2) Define Visual BLOCK
3) hit : and the following appears
4)Complete as follows
:'<,'>norm @q
"combining a recording with a map (to end up in command mode)
:nnoremap ] @q:update<bar>bd
" Visual is the newest and usually the most intuitive editing mode
" Visual basics
v                               : enter visual mode
V                               : visual mode whole line
<C-V>                           : enter VISUAL BLOCK mode
gv                              : reselect last visual area (ultra)
o                               : navigate visual area
"*y or "+y                      : yank visual area into paste buffer  *C*
V%                              : visualise what you match
V}J                             : Join Visual block (great)
V}gJ                            : Join Visual block w/o adding spaces
`[v`]                           : Highlight last insert
:%s/\%Vold/new/g                : Do a substitute on last visual area *N*
" Delete first 2 characters of 10 successive lines
0<c-v>10j2ld  (use Control Q on win32) *C*
" how to copy a set of columns using VISUAL BLOCK
" visual block (AKA columnwise selection) (NOT BY ordinary v command)
<C-V> then select "column(s)" with motion commands (win32 <C-Q>)
then c,d,y,r etc
" how to overwrite a visual-block of text with another such block *C*
" move with hjkl etc
Pick the first block: ctrl-v move y
Pick the second block: ctrl-v move P <esc>
" text objects :h text-objects                                     *C*
daW                                   : delete contiguous non whitespace
di<   yi<  ci<                        : Delete/Yank/Change HTML tag contents
da<   ya<  ca<                        : Delete/Yank/Change whole HTML tag
dat   dit                             : Delete HTML tag pair
diB   daB                             : Empty a function {}
das                                   : delete a sentence
" _vimrc essentials
:set incsearch : jumps to search word as you type (annoying but excellent)
:set wildignore=*.o,*.obj,*.bak,*.exe : tab complete now ignores these
:set shiftwidth=3                     : for shift/tabbing
:set vb t_vb=".                       : set silent (no beep)
:set browsedir=buffer                 : Maki GUI File Open use current directory
" launching Win IE
:nmap ,f :update<CR>:silent !start c:\progra~1\intern~1\iexplore.exe file://%:p<CR>
:nmap ,i :update<CR>: !start c:\progra~1\intern~1\iexplore.exe <cWORD><CR>
" FTPing from VIM
cmap ,r  :Nread
cmap ,w  :Nwrite
gvim # uses netrw.vim
" appending to registers (use CAPITAL)
" yank 5 lines into "a" then add a further 5
[I     : show lines matching word under cursor <cword> (super)
" Conventional Shifting/Indenting
" visual shifting (builtin-repeat)
:vnoremap < <gv
:vnoremap > >gv
" Block shifting (magic)
" also
>% and <%
" Redirection & Paste register *
:redir @*                    : redirect commands to paste buffer
:redir END                   : end redirect
:redir >> out.txt            : redirect to a file
" Working with Paste buffer
"*yy                         : yank curent line to paste
"*p                          : insert from paste buffer
" yank to paste buffer (ex mode)
:'a,'by*                     : Yank range into paste
:%y*                         : Yank whole buffer into paste
:.y*                         : Yank Current line to paster
" filter non-printable characters from the paste buffer
" useful when pasting from some gui application
:nmap <leader>p :let @* = substitute(@*,'[^[:print:]]','','g')<cr>"*p
" Re-Formatting text
gq}                          : Format a paragraph
gqap                         : Format a paragraph
ggVGgq                       : Reformat entire file
Vgq                          : current line
" break lines at 70 chars, if possible after a ;
" Operate command over multiple files
:argdo %s/foo/bar/e          : operate on all files in :args
:bufdo %s/foo/bar/e
:windo %s/foo/bar/e
:argdo exe '%!sort'|w!       : include an external command
:bufdo exe "normal @q" | w   : perform a recording on open files
:silent bufdo !zip %:p   : zip all current files
" Command line tricks
gvim -h                    : help
ls | gvim -                : edit a stream!!
cat xx | gvim - -c "v/^\d\d\|^[3-9]/d " : filter a stream
gvim -o file1 file2        : open into a split
" execute one command after opening file
gvim.exe -c "/main" joe.c  : Open joe.c & jump to "main"
" execute multiple command on a single file
vim -c "%s/ABC/DEF/ge | update" file1.c
" execute multiple command on a group of files
vim -c "argdo %s/ABC/DEF/ge | update" *.c
" remove blocks of text from a series of files
vim -c "argdo /begin/+1,/end/-1g/^/d | update" *.c
" Automate editing of a file (Ex commands in convert.vim)
vim -s "convert.vim" file.c
#load VIM without .vimrc and plugins (clean VIM)
gvim -u NONE -U NONE -N
" Access paste buffer contents (put in a script/batch file)
gvim -c 'normal ggdG"*p' c:/aaa/xp
" print paste contents to default printer
gvim -c 's/^/\=@*/|hardcopy!|q!'
" gvim's use of external grep (win32 or *nix)
:!grep somestring *.php     : creates a list of all matching files *C*
" use :cn(ext) :cp(rev) to navigate list
:h grep
" Using vimgrep with copen                              *N*
:vimgrep /keywords/ *.php
" GVIM Difference Function (Brilliant)
gvim -d file1 file2        : vimdiff (compare differences)
dp                         : "put" difference under cursor to other file
do                         : "get" difference under cursor from other file
" complex diff parts of same file *N*
:1,2yank a | 7,8yank b
:tabedit | put a | vnew | put b
:windo diffthis 
" Vim traps
In regular expressions you must backslash + (match 1 or more)
In regular expressions you must backslash | (or)
In regular expressions you must backslash ( (group)
In regular expressions you must backslash { (count)
/fred\+/                   : matches fred/freddy but not free
/\(fred\)\{2,3}/           : note what you have to break
" \v or very magic (usually) reduces backslashing
/codes\(\n\|\s\)*where  : normal regexp
/\vcodes(\n|\s)*where   : very magic
" pulling objects onto command/search line (SUPER)
<C-R><C-W> : pull word under the cursor into a command line or search
<C-R><C-A> : pull WORD under the cursor into a command line or search
<C-R>-                  : pull small register (also insert mode)
<C-R>[0-9a-z]           : pull named registers (also insert mode)
<C-R>%                  : pull file name (also #) (also insert mode)
<C-R>=somevar           : pull contents of a variable (eg :let sray="ray[0-9]")
" List your Registers
:reg             : display contents of all registers
:reg a           : display content of register a
:reg 12a         : display content of registers 1,2 & a *N*
"5p              : retrieve 5th "ring" 
"1p....          : retrieve numeric registers one by one
:let @y='yy@"'   : pre-loading registers (put in .vimrc)
qqq              : empty register "q"
qaq              : empty register "a"
:reg .-/%:*"     : the seven special registers *N*
:reg 0           : what you last yanked, not affected by a delete *N*
"_dd             : Delete to blackhole register "_ , don't affect any register *N*
" manipulating registers
:let @a=@_              : clear register a
:let @a=""              : clear register a
:let @a=@"              : Save unnamed register *N*
:let @*=@a              : copy register a to paste buffer
:let @*=@:              : copy last command to paste buffer
:let @*=@/              : copy last search to paste buffer
:let @*=@%              : copy current filename to paste buffer
" help for help (USE TAB)
:h quickref             : VIM Quick Reference Sheet (ultra)
:h tips                 : Vim's own Tips Help
:h visual<C-D><tab>     : obtain  list of all visual help topics
                        : Then use tab to step thru them
:h ctrl<C-D>            : list help of all control keys
:helpg uganda           : grep HELP Files use :cn, :cp to find next
:helpgrep edit.*director: grep help using regexp
:h :r                   : help for :ex command
:h CTRL-R               : normal mode
:h /\r                  : what's \r in a regexp (matches a <CR>)
:h \\zs                 : double up backslash to find \zs in help
:h i_CTRL-R             : help for say <C-R> in insert mode
:h c_CTRL-R             : help for say <C-R> in command mode
:h v_CTRL-V             : visual mode
:h tutor                : VIM Tutor
<C-[>, <C-T>            : Move back & Forth in HELP History
gvim -h                 : VIM Command Line Help
:cabbrev h tab h        : open help in a tab *N*
" where was an option set
:scriptnames            : list all plugins, _vimrcs loaded (super)
:verbose set history?   : reveals value of history and where set
:function               : list functions
:func SearchCompl       : List particular function
" making your own VIM help
:helptags /vim/vim64/doc  : rebuild all *.txt help files in /doc
:help add-local-help
" running file thru an external program (eg php)
map   <f9>   :w<CR>:!c:/php/php.exe %<CR>
map   <f2>   :w<CR>:!perl -c %<CR>
" capturing output of current script in a separate buffer
:new | r!perl #                   : opens new buffer,read other buffer
:new! x.out | r!perl #            : same with named file
" create a new buffer, paste a register "q" into it, then sort new buffer
:new +put q|%!sort
" Inserting DOS Carriage Returns
:%s/$/\<C-V><C-M>&/g          :  that's what you type
:%s/$/\<C-Q><C-M>&/g          :  for Win32
:%s/$/\^M&/g                  :  what you'll see where ^M is ONE character
" automatically delete trailing Dos-returns,whitespace
autocmd BufRead * silent! %s/[\r \t]\+$//
autocmd BufEnter *.php :%s/[ \t\r]\+$//e
" perform an action on a particular file or file type
autocmd VimEnter c:/intranet/note011.txt normal! ggVGg?
autocmd FileType *.pl exec('set fileformats=unix')
" Retrieving last command line command for copy & pasting into text
" Retrieving last Search Command for copy & pasting into text
" more completions
<C-X><C-F>                        :insert name of a file in current directory
" Substituting a Visual area
" select visual area as usual (:h visual) then type :s/Emacs/Vim/ etc
:'<,'>s/Emacs/Vim/g               : REMEMBER you dont type the '<.'>
gv                                : Re-select the previous visual area (ULTRA)
" inserting line number into file
:g/^/exec "s/^/".strpart(line(".")."    ", 0, 4)
:%s/^/\=strpart(line(".")."     ", 0, 5)
:%s/^/\=line('.'). ' '
#numbering lines VIM way
:set number                       : show line numbers
:map <F12> :set number!<CR>       : Show linenumbers flip-flop
:%s/^/\=strpart(line('.')."        ",0,&ts)
#numbering lines (need Perl on PC) starting from arbitrary number
:'a,'b!perl -pne 'BEGIN{$a=223} substr($_,2,0)=$a++'
#Produce a list of numbers
#Type in number on line say 223 in an empty file
qqmnYP`n^Aq                       : in recording q repeat with @q
" increment existing numbers to end of file (type <c-a> as 5 characters)
:.,$g/^\d/exe "normal! \<c-a>"
" advanced incrementing
" advanced incrementing (really useful)
" put following in _vimrc
let g:I=0
function! INC(increment)
let g:I =g:I + a:increment
return g:I
" eg create list starting from 223 incrementing by 5 between markers a,b
:let I=223
" create a map for INC
cab viminc :let I=223 \| 'a,'bs/$/\=INC(5)/
" generate a list of numbers  23-64
" editing/moving within current insert (Really useful)
<C-U>                             : delete all entered
<C-W>                             : delete last word
<HOME><END>                       : beginning/end of line
<C-LEFTARROW><C-RIGHTARROW>       : jump one word backwards/forwards
<C-X><C-E>,<C-X><C-Y>             : scroll while staying put in insert
#encryption (use with care: DON'T FORGET your KEY)
:X                                : you will be prompted for a key
:h :X
" modeline (make a file readonly etc) must be in first/last 5 lines
// vim:noai:ts=2:sw=4:readonly:
" vim:ft=html:                    : says use HTML Syntax highlighting
:h modeline
" Creating your own GUI Toolbar entry
amenu  Modeline.Insert\ a\ VIM\ modeline <Esc><Esc>ggOvim:ff=unix ts=4 ss=4<CR>vim60:fdm=marker<esc>gg
" A function to save word under cursor to a file
function! SaveWord()
   normal yiw
   exe ':!echo '.@0.' >> word.txt'
map ,p :call SaveWord()
" function to delete duplicate lines
function! Del()
 if getline(".") == getline(line(".") - 1)
   norm dd

:g/^/ call Del()
" Digraphs (non alpha-numerics)
:digraphs                         : display table
:h dig                            : help
i<C-K>e'                          : enters é
i<C-V>233                         : enters é (Unix)
i<C-Q>233                         : enters é (Win32)
ga                                : View hex value of any character
#Deleting non-ascii characters (some invisible)
:%s/[\x00-\x1f\x80-\xff]/ /g      : type this as you see it
:%s/[<C-V>128-<C-V>255]//gi       : where you have to type the Control-V
:%s/[€-ÿ]//gi                     : Should see a black square & a dotted y
:%s/[<C-V>128-<C-V>255<C-V>01-<C-V>31]//gi : All pesky non-asciis
:exec "norm /[\x00-\x1f\x80-\xff]/"        : same thing
#Pull a non-ascii character onto search bar
yl/<C-R>"                         :
/[^a-zA-Z0-9_[:space:][:punct:]]  : search for all non-ascii
" All file completions grouped (for example main_c.c)
:e main_<tab>                     : tab completes
gf                                : open file under cursor  (normal)
main_<C-X><C-F>                   : include NAME of file in text (insert mode)
" Complex Vim
" swap two words
:%s/\<\(on\|off\)\>/\=strpart("offon", 3 * ("off" == submatch(0)), 3)/g
" swap two words
:vnoremap <C-X> <Esc>`.``gvP``P
" Swap word with next word
nmap <silent> gw    "_yiw:s/\(\%#\w\+\)\(\_W\+\)\(\w\+\)/\3\2\1/<cr><c-o><c-l> *N*
" Convert Text File to HTML
:runtime! syntax/2html.vim        : convert txt to html
:h 2html
" VIM has internal grep
:grep some_keyword *.c            : get list of all c-files containing keyword
:cn                               : go to next occurrence
" Force Syntax coloring for a file that has no extension .pl
:set syntax=perl
" Remove syntax coloring (useful for all sorts of reasons)
:set syntax off
" change coloring scheme (any file in ~vim/vim??/colors)
:colorscheme blue
" Force HTML Syntax highlighting by using a modeline
# vim:ft=html:
" Force syntax automatically (for a file with non-standard extension)
au BufRead,BufNewFile */Content.IE?/* setfiletype html
:set noma (non modifiable)        : Prevents modifications
:set ro (Read Only)               : Protect a file from unintentional writes
" Sessions (Open a set of files)
gvim file1.c file2.c lib/lib.h lib/lib2.h : load files for "session"
:mksession                        : Make a Session file (default Session.vim)
gvim -S Session.vim               : Reload all files
#tags (jumping to subroutines/functions)
taglist.vim                       : popular plugin
:Tlist                            : display Tags (list of functions)
<C-]>                             : jump to function under cursor
" columnise a csv file for display only as may crop wide columns
:let width = 20
:let fill=' ' | while strlen(fill) < width | let fill=fill.fill | endwhile
:%s/\([^;]*\);\=/\=strpart(submatch(1).fill, 0, width)/ge
" Highlight a particular csv column (put in .vimrc)
function! CSVH(x)
    execute 'match Keyword /^\([^,]*,\)\{'.a:x.'}\zs[^,]*/'
    execute 'normal ^'.a:x.'f,'
command! -nargs=1 Csv :call CSVH(<args>)
" call with
:Csv 5                             : highlight fifth column
zf1G      : fold everything before this line *N*
" folding : hide sections to allow easier comparisons
zf}                               : fold paragraph using motion
v}zf                              : fold paragraph using visual
zf'a                              : fold to mark
zo                                : open fold
zc                                : re-close fold
:help folding
zfG      : fold everything after this line *N*
" displaying "non-asciis"
:set list
:h listchars
" How to paste "normal commands" w/o entering insert mode
:norm qqy$jq
" manipulating file names
:h filename-modifiers             : help
:w %                              : write to current file name
:w %:r.cfm                        : change file extention to .cfm
:!echo %:p                        : full path & file name
:!echo %:p:h                      : full path only
:!echo %:t                        : filename only
:reg %                            : display filename
<C-R>%                            : insert filename (insert mode)
"%p                               : insert filename (normal mode)
/<C-R>%                           : Search for file name in text
" delete without destroying default buffer contents
"_d                               : what you've ALWAYS wanted
"_dw                              : eg delete word (use blackhole)
" pull full path name into paste buffer for attachment to email etc
nnoremap <F2> :let @*=expand("%:p")<cr> :unix
nnoremap <F2> :let @*=substitute(expand("%:p"), "/", "\\", "g")<cr> :win32
" Simple Shell script to rename files w/o leaving vim
$ vim
:r! ls *.c
:%s/\(.*\).c/mv & \1.bla
:w !sh
" count words/lines in a text file
g<C-G>                                 # counts words
:echo line("'b")-line("'a")            # count lines between markers a and b *N*
:'a,'bs/^//n                           # count lines between markers a and b
:'a,'bs/somestring//gn                 # count occurences of a string
" example of setting your own highlighting
:syn match DoubleSpace "  "
:hi def DoubleSpace guibg=#e0e0e0
" reproduce previous line word by word
imap ]  @@@<ESC>hhkyWjl?@@@<CR>P/@@@<CR>3s
nmap ] i@@@<ESC>hhkyWjl?@@@<CR>P/@@@<CR>3s
" Programming keys depending on file type
:autocmd bufenter *.tex map <F1> :!latex %<CR>
:autocmd bufenter *.tex map <F2> :!xdvi -hush %<.dvi&<CR>
" reading Ms-Word documents, requires antiword
:autocmd BufReadPre *.doc set ro
:autocmd BufReadPre *.doc set hlsearch!
:autocmd BufReadPost *.doc %!antiword "%"
" a folding method
vim: filetype=help foldmethod=marker foldmarker=<<<,>>>
A really big section closed with a tag <<< 
--- remember folds can be nested --- 
Closing tag >>> 
" Return to last edit position (You want this!) *N*
autocmd BufReadPost *
     \ if line("'\"") > 0 && line("'\"") <= line("$") |
     \   exe "normal! g`\"" |
     \ endif
" store text that is to be changed or deleted in register a
"act<                                 :  Change Till < *N*
# using gVIM with Cygwin on a Windows PC
if has('win32')
source $VIMRUNTIME/mswin.vim
behave mswin
set shell=c:\\cygwin\\bin\\bash.exe shellcmdflag=-c shellxquote=\"
" Just Another Vim Hacker JAVH
vim -c ":%s%s*%Cyrnfr)fcbafbe[Oenz(Zbbyranne%|:%s)[[()])-)Ig|norm Vg?"
"Read Vimtips into a new vim buffer (needs
:tabe | :r ! w3m -dump    *N*
updated version at
Please email any errors, tips etc to
" Information Sources
Vim Wiki *** VERY GOOD *** *N*
Vim Use VIM newsgroup *N*
comp.editors "VIM" specific newsgroup
VIM Webring
Vim Book
Searchable VIM Doc
VimTips PDF Version (PRINTABLE!)
" : commands to neutralise < for HTML display and publish
" use yy@" to execute following commands
:w!|sav! vimtips.html|:/^__BEGIN__/,/^__END__/s#<#\<#g|:w!|:!vimtipsftp


Wednesday, January 5, 2011

Using diff and patch

SkyHi @ Wednesday, January 05, 2011
The diff and patch utilites can be intimidating to the newcomer, but they are not all that difficult to use, even for the non-programmer. If you are at all familiar with makefiles, you might find yourself frequently wanting to patch a file, either to correct an error that you've found or to add something that you need to the makefile. After I began using the mrxvt terminal, I wanted to give it Japanese capability. My main O/S is FreeBSD. It manages packages with its ports and package system. To install a package from a port, one uses the port's Makefile, which will download and compile the souce code, in a manner familiar to those who use Gentoo's portage (which was inspired by FreeBSD ports) or ArchLinux's makepkg.
In this case, I wanted to edit the port's Makefile to enable Japanese support.
To do this, I simply had to add a line to the Makefile.
CONFIGURE_ARGS+=   --enable-xim --enable-cjk --with-encoding=eucj 
Ok, this is simple. However, after doing this, I thought that perhaps I should submit a patch to the port's maintainer, giving others the opportunity to include Japanese support. This was a little more complicated, because the change to the Makefile meant that I should include a message when they installed the port, telling them what to do if they wished to include Japanese capability.
So, I had to add the following lines, in their proper place in the port's Makefile.
.if defined(WITH_JAPANESE)
 CONFIGURE_ARGS+=   --enable-xim --enable-cjk --with-encoding=eucj 
  @${ECHO_MSG} "=========================================>"
 @${ECHO_MSG} "For Japanese support use make -DWITH_JAPANESE install"
  @${ECHO_MSG} "=========================================>"
I created my new Makefile, and then, using the diff command, created a patch.
diff -uN Makefile > patch.Makefile
The diff command has various flags. Simply doing a diff between two files shows something like (I'm just showing a few lines here)
< # $FreeBSD: /repoman/r/pcvs/ports/x11/mrxvt/Makefile,v 1.7 2005/07/22 22:38:58 pav Exp $
> # $FreeBSD: ports/x11/mrxvt/Makefile,v 1.7 2005/07/22 22:38:58 pav Exp $
> .if defined(WITH_JAPANESE)
> CONFIGURE_ARGS+=   --enable-xim --enable-cjk --with-encoding=eucj 
> .endif # WITH_JAPANESE
In this simple example, you can probably figure out its meaning. The < refers to lines in the original file that aren't in the new one and > refers to lines in the new file that aren't in the old one. The 5c5 means that there is a difference in the 5th line. The c means something would have to be changed for them to match. The 25a30,37 means that text would be added at line 25. In this case, we don't have it, but there is also use of the letter d for text to be deleted.
This is a bit hard to read, especially if there are many differences. Therefore, most people prefer unified diffs, diff with the -u flag. This gives us something like (again, with many lines snipped)
--- Makefile.orig Sat Sep 10 17:16:53 2005
+++ Makefile Fri Sep 16 03:13:52 2005
@@ -20,9 +20,21 @@
+.if defined(WITH_JAPANESE)
+CONFIGURE_ARGS+=   --enable-xim --enable-cjk --with-encoding=eucj 
This shows a few lines before and after the change, which helps define context. (I've snipped the lines below this change, but you can see that three lines above it are included.) Let's examine this a bit. The first lines are fairly straightforward, they have --- and the old file's name, then +++ and the name of the new file. It also contains the ctime (the time the file was last modified.) Next is what is known as the hunk. This line will start with @@ then have the old file's starting line, the old number of lines, the new start and the new number of lines, then another @@.
Understand that the three lines above and below the change remain as they are. The 3 lines are simply to give context. In this case, including that context, the change starts at line 20. Lines 20-23 will remain unchanged. Including the 3 lines above and below the differences, the change will go for 9 lines. So, we are changing 9 lines, starting from line 20, (which will include 3 lines above and 3 lines below the actual change). Therefore, this is shown with a minus sign.
Following that is the plus sign. The first number 20, is the first line of the new file and the change, including the 3 lines above and below, will continue for 21 lines. Note that I have not shown the entire patch and also some of those lines may simply be blank lines. So, the hunk starts with
@@ -20,9 +20,21 @@
Next comes the actual patch itself, the 3 lines of context and the change.
Note that in the patch, there is a space before the 3 lines of context, and then the lines below have a plus sign. A space before a line means that nothing will be changed. A plus sign means the line will be added. If there had been lines to be deleted, they would have had a minus sign in front of them.
Let's create two files to make this a little clearer. Using your favorite text editor, create patchtest.txt and patchtestnew.txt. The patchtest.txt will read
This is a file.
These first three lines are 
lines of context. They 
will remain unchanged. They will have spaces in front of them.
Here are the lines that will be changed.  They will begin with
minus signs, because they are being deleted.
Now, we will add three 
more lines that are only
context.  They will have spaces in the patch
Now, patchtest1.txt
This is a file.
These first three lines are 
lines of context. They 
will remain unchanged. They will have spaces in front of them.
These lines have been changed.  They will have plus
signs in front of them.
Now, we will add three 
more lines that are only
context.  They will have spaces in the patch
Create the patch.
patch -uN patchtest.txt patchtest1.txt > patch.txt
View the patch
less patch.txt
You will see
--- patchtest Sun Feb 26 19:35:43 2006
+++ patchtest1 Sun Feb 26 19:35:14 2006
@@ -2,8 +2,9 @@
 These first three lines are 
 lines of context. They 
 will remain unchanged. They will have spaces in front of them.
-Here are the lines that will be changed.  They will begin with
-minus signs, because they are being deleted.
+These lines have been changed.  They will have plus
+signs in front of them.
 Now, we will add three 
 more lines that are only
 context.  They will have spaces in the patch
+This is yet another line that is different.
You can see the first line, This is a file, wasn't included in the patch--that's because it was outside of the three lines of context.
Now that we've made our patch, we can apply it.
patch patchtest.txt < patch.txt
You will see
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
|--- patchtest Sun Feb 26 19:35:43 2006
|+++ patchtest1 Sun Feb 26 19:35:14 2006
Patching file patchtest using Plan A...
Hunk #1 succeeded at 2.
Now, patchtest.txt has been patched. If you now do a diff between patchtest and patchtest1, you'll just be put back at your command prompt, showing that there are no differences.
This is simplest form of creating diffs and using patches. Sometimes, you patch an entire directory--those who compile their own kernels may have done this. Rather than downloading an entire new tarball of the new kernel, there are often patches, especially of minor revision numbers. The README in /usr/src/linux has instructions for using these patches. When you are applying a patch to an entire directory tree, you may need to use the -p1 option. The p[number] basically helps determine the path to the file or files being patched. See man(1) patch for details and examples. For instance, if you had a patch for the entire Linux kernel source tree, and were in /usr/src you might do
patch -p1 < mylinuxsource.patch
As this varies, depending not only upon your location when applying the patch, but also what is in the patch, it's best to see the man page, however, just keep in mind that if trying to apply a patch that covers several files in a directory doesn't work, it may be the p[number] that is causing the difficulty.
Although this becomes more complex when making patches consisting of many hunks, or patching many files in a directory, (such as the Linux kernel source tree) the basic concept is the same. It is hoped that this article gives the reader a better understanding of diff and patch, and will help them to read and understand patches. This can be very handy--sometimes, a patch has something you don't want, so it' always good to look at it before applying it.
Patches can also be reversed with the -R flag. Suppose you try an experimental patch and it breaks something. You can then patch the file again with the -R flag.
Take our patchtest and patchtest1. Let's run patch again with the -R flag.
patch -R patchtest.txt < patch.txt
Again, you'll see that Hrrm...Looks like a unified diff to me message and a message that it succeeded. Actually, if you forget the -R flag, patch often catches it. Patch the file one more time with patch patchtest.txt < patch.txt and it should succeed. Once again, patchtest and patchtest1 are identical. Now, try it again, without the -R flag. You'll see a message
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
|--- patchtest Sun Feb 26 19:35:43 2006
|+++ patchtest1 Sun Feb 26 19:35:14 2006
Patching file patchtest using Plan A...
Reversed (or previously applied) patch detected!  Assume -R? [y] 
If you type y then you should once again see that it succeeded.
If anyone is interested, my patch for mrxvt was accepted, and the port is now available with the option to enable Japanese.


ln -s d1 d2 # Am I the only person who gets this the wrong way round every fucking time?

SkyHi @ Wednesday, January 05, 2011
I used to get it wrong, too. Now I remember it together with cp:

cp existing new
ln -s existing new

p source destination
mv source destination
ln [-s] source destination

user1@Januty:~/testdir$ ls -l 32.d
-rw-r--r-- 1 user1 user1 164 2011-01-04 17:07 32.d

user1@Januty:~/testdir$ ln -s 32.d 3322.dd

user1@Januty:~/testdir$ ls -tlrh 3322.dd
lrwxrwxrwx 1 user1 user1 4 2011-01-05 14:14 3322.dd -> 32.d

user1@Januty:~/testdir$ ls -tlrh 32.d
-rw-r--r-- 1 user1 user1 164 2011-01-04 17:07 32.d

user1@Januty:~/testdir$ rm 3322.dd
rm: remove symbolic link `3322.dd'? y

user1@Januty:~/testdir$ ls -tlrh 3322.dd
ls: cannot access 3322.dd: No such file or directory

user1@Januty:~/testdir$ ls -tlrh 32.d
-rw-r--r-- 1 user1 user1 164 2011-01-04 17:07 32.d


Tuesday, January 4, 2011

Password Protect a Directory or File with .htaccess htpasswd

SkyHi @ Tuesday, January 04, 2011
A tutorial explaining how to restrict access to a directory on a web server using .htaccess.

# AllowOverride controls what directives may be placed in .htaccess files.
# It can be "All", "None", or any combination of the keywords:
# Options FileInfo AuthConfig Limit

<Directory /var/www/>
AllowOverride All

Password protecting a directory can be done several ways. Many people use PHP or ASP to verify users, but if you want to protect a directory of files or images (for example), that often isn't practical. Fortunately, Apache has a built-in method for protecting directories from prying eyes, using the .htaccess file.

In order to protect your chosen directory, you will first need to create an .htaccess file. This is the file that the server will check before allowing access to anything in the same directory. That's right, the .htaccess file belongs in the directory you are protecting, and you can have one in each of as many directories as you like.

You'll need first to define a few parameters for the .htaccess file. It needs to know where to find certain information, for example a list of valid usernames and passwords. This is a sample of the few lines required in an .htaccess file to begin with, telling it where the usernames and passwords can be found, amongst other things.

AuthUserFile /full/path/to/.htpasswd 
AuthName "Please Log In" 
AuthType Basic

You've now defined a few basic parameters for Apache to manage the authorisation process. First, you've defined the location of the .htpasswd file. This is the file that contains all the usernams and encrypted passwords for your site. We'll cover adding information to this file shortly. It's extremely important that you place this file outside of the web root. You should only be able to access it by FTP, not over the web.

The AuthName parameter basically just defines the title of the password entry box when the user logs in. It's not exactly the most important part of the file, but should be defined. The AuthType tells the server what sort of processing is in use, and "Basic" is the most common and perfectly adequate for almost any purpose.

We've told apache where to find files, but we've not told it who, of those people defined in the .htpasswd file, can access the directory. For that reason, we still have another line to define.

If we want to grant access to everyone in the .htpasswd file, we can add this line ("valid-user" is like a keyword, telling apache any user will do):
require valid-user

If we want to just grant access to a single user, we can use "user" and their username instead of "valid-user":
require user dave

A normal and complete .htaccess file might look like this:
AuthUserFile /home/dave/.htpasswd 
AuthName "Dave's Login Area" 
AuthType Basic 
require user dave

Now we have almost everything defined, but we are still missing an .htpasswd file. Without that, the server won't know what usernames and passwords are ok.

An .htpasswd file is made up of a series of lines, one for each valid user. Each line looks like this, with a username, then colon, then encrypted password:

The password encryption is the same as you'll find in PHP's crypt() function. It is not reversible, so you can't find out a password from the encrypted version. (Please note that on page 2 of this article is a tool to help you generate an .htpasswd file, that will help you encrypt passwords).

A user of "dave" and password of "dave" might be added with the following line:

Each time you run an encryption function like "crypt", you will almost certainly get a different result. This is down to something called "salt", which in the above case was "XO" (first two letters of encrypted password). Different salt will give different encrypted values, and if not explicitly specified will be randomly generated. Don't worry though, the server is quite capable of understanding all this - if you come up with a different value for the encrypted password and replace it, everything would still work fine, as long as the password was the same.

Once you've created your .htpasswd file, you need to upload it to a safe location on your server, and check you've set the .htaccess file to point to it correctly. Then, upload the .htaccess file to the directory you want to protect and you'll be all set. Simply visit the directory to check it is all working.
.htpasswd Generator

The .htpasswd file needs encrypted passwords, which can be a problem for anyone without experience with a programming language. For that reason, I've created this simple tool, which, if you enter the username and password you wish to use, will generate the appropriate line to add to your .htpasswd file.
#-c Create the passwdfile for the first time
#-s SHA encryption

$ htpasswd -sc /var/www/html/ cgsociety

To prohibit downloading of your .htpasswd & associated files your .htaccess should look something like this:

AuthType Basic 
AuthName "Authorization Required" 
AuthUserFile /path/to/.htpasswd 

<files ~ "^\.ht"> 
Order allow,deny 
Deny from all 


<Limit GET POST> 
require user username 


iPasswd - .htpasswd password generator

iPasswd is an online password generation tool for .htpasswd files. These files store a username and password combination (one per line of the file) which is used with .htaccess Basic Authentication. iPasswd also supports MD5 passwords, used in Digest Authentication.

* Enter a username below. (Note: the username is case sensitive.)
* Enter a password below (Note: the password is case sensitive.)
* If your server uses MD5 password encryption, select the checkbox. If you do not know, leave this unselected.
* Click the "Generate" button.
* Your username and encrypted password will be displayed in a new window.
* Copy and paste the username:password into your .htpasswd file (using a text editor). You can have as many usernames as you want. Each username:password pair must be listed on its own line in the .htpasswd file.
* Upload the .htaccess file to your web site. It should be located in a directory which is not accessable via a web browser (not your "public_html" directory).

User Name:
Use MD5 password encryption
Create .htaccess file

The .htaccess file tells the web server whether to use password protection on files that it is serving.

You'll need to know the full path to your .htaccess file. This is probably something like:


Ask your web host if you do not know the full path to your home directory.

Upload the .htaccess file into the directory you want to protect, or the directory containing the files you want to protect.

You should use one of the following, depending on whether you want to protect all files or just some files in a directory. Replace /full/path/to/.htpasswd with actual pathname of your .htpasswd file. Replace "Please Login" with the message you want displayed when visitors are prompted for a password.
Directory Protection
To protect all files in a directory use a .htaccess file like this:

<Files ".ht*">
order allow,deny
deny from all

AuthUserFile /full/path/to/.htpasswd
AuthType Basic
AuthName "Please Login"

require valid-user

File Protection

To protect only certain files in a directory, use a .htaccess file like this.
<Files ".ht*">
order allow,deny
deny from all

<Files private1.html private2.html>
AuthUserFile /home/pathto/.htpasswd
AuthType Basic
AuthName "Please Login"
Require valid-user
You can use * wildcards in file names to match multiple files.

If you are familiar with regular expressions, you can replace the Files directive with FilesMatch.

For complete specs of .htaccess password protection see the Apace mod_auth documentation

Basic passwords are easy to reverse, meaning that if someone gets hold of your .htpasswd file, they can find out your password. MD5 passwords do not suffer from this, so if your web host supports it, using MD5 passwords will provide protection against this. To do so, change the "AuthType Basic" to "AuthType Digest" in the .htaccess file and use MD5 passwords in your .htpasswd file.


TIME_WAIT in netstat

SkyHi @ Tuesday, January 04, 2011

So, ever wonder what all those TIME_WAITs are doing in your netstat listing?

Okay, for those of you who don't spend all your waking hours fooling around with Web servers, let me back up a little and explain what that sentence meant.

Netstat is a little utility that many administrators use to monitor the network connections on their servers. It is quite useful for tracking down that small subset of performance bottlenecks that aren't attributable to yet another piece of convoluted application code that some careless programmer wrote and now you have to take care of. But I digress.

When you run netstat on your busy IIS box, you might get something that looks like this:

C:\>netstat -np tcp

Active Connections

Proto Local Address Foreign Address State


And so on and on, for many, many lines. Each line here represents a connection between a TCP socket your server and a matching one on some other machine--usually an HTTP client such as a browser or proxy server, but depending on your architecture you might also see connections to other kinds of servers (database, application, directory, etc.). Each connection has a unique combination of IP addresses and port numbers that identify the endpoints to which the sockets are bound. More to the point, each one also has a state indicator. As connections are set up used and torn down, they pass through a variety of these states, most of which aren't shown here, because they come and go quite quickly).

The connections in the ESTABLISHED state are, well, established--they are neither being set up nor torn down but just used. This is what you will often see the most of. But what about the others? On a busy HTTP server, the number of sockets in this TIME_WAIT state can far exceed those in the ESTABLISHED state. For instance, I checked an IIS 6.0 box that serves a fairly busy corporate site earlier today and got 124 ESTABLISHED connections versus 431 in TIME_WAIT.

What does this all mean? More importantly, is it something you should be worried about?

The answers are:

1. It's complicated.

2. Maybe.

To understand what all those TIME_WAITs are doing there, it's useful to review (or learn) a little TCP. I'll wait here while you brush up on RFC793.

That was fast. Just kidding. The bit you need to know is so simple, even I can explain it.

As you know, TCP provides a reliable connection between two endpoints, across which data can be sent in segmented form. As part of this, TCP also provides a mechanism for gracefully shutting down such connections. This is accomplished with a full duplex handshake, which can be diagrammed like so:

Server Client

-------------- FIN -------------->

<------------- ACK ---------------

<------------- FIN ---------------

-------------- ACK ------------->

As you can see by this very sophisticated diagram, a graceful shutdown requires the two endpoints to exchange some TCP/IP packets with the FIN and ACK bits set, in a certain sequence. This exchange of packets in turn corresponds to certain state changes on each side of the connection. In the diagram, I've labeled the two sides "Server" and "Client" such that the sequence of events mirrors what usually happens when connections are closed by HTTP.

Here is what happens, step-by-step:

1. First the application at one endpoint--in this example, that would be the Web server--initiates what is called an "active close." The Web server itself is now done with the connection, but the TCP implementation that supplied the socket it was using still has some work to do. It sends a FIN to the other endpoint and goes into a state called FIN_WAIT_1.

2. Next the TCP endpoint on the browser's side of the connection acknowledges the server's FIN by sending back an ACK, and goes into a state called CLOSE_WAIT. When the server side receives this ACK, it switches to a state called FIN_WAIT_2. The connection is now half-closed.

3. At this point, the socket on the client side is in a "passive close," meaning it waits for the application that was using it (the browser) to close. When this happens, the client sends its own FIN to the server, and deallocates the socket on the client side. It's done.

4. When the server gets that last FIN, it of course sends back on ACK to acknowledge it, and then goes into the infamous TIME_WAIT state. For how long? Ah, there's the rub.

The socket that initiated the close is supposed to stay in this state for twice the Maximum Segment Lifetime--2MLS in geek speak. The MLS is supposed to be the length of time a TCP segment can stay alive in the network. So, 2MLS makes sure that any segments still out there when the close starts have time to arrive and be discarded. Why bother with this, you ask?

Because of delayed duplicates, that's why. Given the nature of TCP/IP, it's possible that, after an active close has commenced, there are still duplicate packets running around, trying desperately to make their way to their destination sockets. If a new socket binds to the same IP/port combination before these old packets have had time to get flushed out of the network, old and new data could become intermixed. Imagine the havoc this could cause around the office: "You got JavaScript in my JPEG!"

So, TIME_WAIT was invented to keep new connections from being haunted by the ghosts of connections past. That seems like a good thing. So what's the problem?

The problem is that 2MLS happens to be a rather long time--240 seconds, by default. There are several costs associated with this. The state for each socket is maintained in a data structure called a TCP Control Block (TCB). When IP packets come in they have to be associated with the right TCB and the more TCBs there are, the longer that search takes. Modern implementations of TCP combat this by using a hash table instead of a linear search. Also, since each TIME_WAIT ties up an IP/port combination, too many of them can lead to exhaustion of the default number of ephemeral ports available for handling new requests. And even if the TCB search is relatively fast, and even if there are plenty of ports to bind to, the extra TCBs still take up memory on the server side. In short, the need to limit the costs of TIME_WAIT turns out to be a long-standing problem. In fact, this was part of the original case for persistent connections in HTTP 1.1.

The good news is that you can address this problem by shortening the TIME_WAIT interval. This article by Brett Hill explains how to do so for IIS. As Brett explains, four minutes is probably longer than needed for duplicate packets to flush out of the network, given that modern network latencies tend to be much shorter than that. The bad news is that, while shortening the interval is quite common, it still entails risks. As Faber, Touch and Yue (who are the real experts on this) explain: "The size of the MSL to maintain a given memory usage level is inversely proportional to the connection rate." In other words, the more you find yourself needing to reduce the length of TIME_WAIT, the more likely doing so will cause problems.


PHP encode and decode hex

SkyHi @ Tuesday, January 04, 2011

$text = "php rocks!";
$encoded = preg_replace(
echo "ENCODED: $encoded\n";

//ENCODED: 50485020726f636b7321

echo "DECODED: ".preg_replace(

//DECODED: php rocks!


Sunday, January 2, 2011

How to control Windows 7 snap feature with two monitors?

SkyHi @ Sunday, January 02, 2011

You should use only shortcuts for that configuration

(mouse movements alone seems not supported for that special side that crosses over to the other monitor)

NOTE: On multiple monitors, continually pressing a keyboard shortcut combination below will cause the window to snap to the side and scroll in that direction through the multiple monitor screens.

  • Windows Key + Left Arrow Key

    This will snap the current window to the left side of the screen, or unsnap a window that is snapped to the right side of the screen.

  • Windows Key + Right Arrow Key

    This will snap the current window to the right side of the screen, or unsnap a window that is snapped to the left side of the screen.