Friday, December 31, 2010

Generic Input Sanitizer (PHP 5.2 or Greater)

SkyHi @ Friday, December 31, 2010
With attacks that plant malware links or deface websites on the rise, a programmer must be ready. Many of these attacks also target older systems that still need to be supported. I have developed an approach for PHP 5.2 and later: a block of code that you add at the top of your script. By default every incoming field is sanitized as a string. If a field needs a different filter, add its form field name or query string field name to the array and let the script do the rest.
Here is the code:


<?php
# Add the Post or Get fields coming in to specify filter.
# Default: filter string
$filters = array(
  'my_text'       =>  'string',
  'my_email'      =>  'email',
  'my_url'        =>  'url',
  'my_chars'      =>  'special',
  'my_int'        =>  'int',
  'my_float'      =>  'float',
  'my_encoded'    =>  'encoded'
);
 
foreach($_POST as $key=>$value){
 
  if(array_key_exists($key, $filters)){
  switch ($filters[$key]){
  case 'string':
  $_POST[$key] = filter_input(INPUT_POST, $key, FILTER_SANITIZE_STRING);
  $_REQUEST[$key] = filter_input(INPUT_POST, $key, FILTER_SANITIZE_STRING);
  break;
   
  case 'email':
  $_POST[$key] = filter_input(INPUT_POST, $key, FILTER_SANITIZE_EMAIL);
  $_REQUEST[$key] = filter_input(INPUT_POST, $key, FILTER_SANITIZE_EMAIL);
  break;
   
  case 'url':
  $_POST[$key] = filter_input(INPUT_POST, $key, FILTER_SANITIZE_URL);
  $_REQUEST[$key] = filter_input(INPUT_POST, $key, FILTER_SANITIZE_URL);
  break;
   
  case 'special':
  $_POST[$key] = filter_input(INPUT_POST, $key, FILTER_SANITIZE_SPECIAL_CHARS);
  $_REQUEST[$key] = filter_input(INPUT_POST, $key, FILTER_SANITIZE_SPECIAL_CHARS);
  break;
   
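  case 'int':
  # The integer sanitizer is named FILTER_SANITIZE_NUMBER_INT.
  $_POST[$key] = filter_input(INPUT_POST, $key, FILTER_SANITIZE_NUMBER_INT);
  $_REQUEST[$key] = filter_input(INPUT_POST, $key, FILTER_SANITIZE_NUMBER_INT);
  break;
   
  case 'float':
  # FILTER_FLAG_ALLOW_FRACTION keeps the decimal point, which this filter strips by default.
  $_POST[$key] = filter_input(INPUT_POST, $key, FILTER_SANITIZE_NUMBER_FLOAT, FILTER_FLAG_ALLOW_FRACTION);
  $_REQUEST[$key] = filter_input(INPUT_POST, $key, FILTER_SANITIZE_NUMBER_FLOAT, FILTER_FLAG_ALLOW_FRACTION);
  break;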
   
  case 'encoded':
  $_POST[$key] = filter_input(INPUT_POST, $key, FILTER_SANITIZE_ENCODED);
  $_REQUEST[$key] = filter_input(INPUT_POST, $key, FILTER_SANITIZE_ENCODED);
  break;
   
  default :
  $_POST[$key] = filter_input(INPUT_POST, $key, FILTER_SANITIZE_STRING);
  $_REQUEST[$key] = filter_input(INPUT_POST, $key, FILTER_SANITIZE_STRING);
  }
  } else {
  $_POST[$key] = filter_input(INPUT_POST, $key, FILTER_SANITIZE_STRING);
  $_REQUEST[$key] = filter_input(INPUT_POST, $key, FILTER_SANITIZE_STRING);
  }
 
}
 
# Query string values must be read with INPUT_GET, not INPUT_POST.
foreach($_GET as $key=>$value){
 
  if(array_key_exists($key, $filters)){
  switch ($filters[$key]){
  case 'string':
  $_GET[$key] = filter_input(INPUT_GET, $key, FILTER_SANITIZE_STRING);
  $_REQUEST[$key] = filter_input(INPUT_GET, $key, FILTER_SANITIZE_STRING);
  break;
   
  case 'email':
  $_GET[$key] = filter_input(INPUT_GET, $key, FILTER_SANITIZE_EMAIL);
  $_REQUEST[$key] = filter_input(INPUT_GET, $key, FILTER_SANITIZE_EMAIL);
  break;
   
  case 'url':
  $_GET[$key] = filter_input(INPUT_GET, $key, FILTER_SANITIZE_URL);
  $_REQUEST[$key] = filter_input(INPUT_GET, $key, FILTER_SANITIZE_URL);
  break;
   
  case 'special':
  $_GET[$key] = filter_input(INPUT_GET, $key, FILTER_SANITIZE_SPECIAL_CHARS);
  $_REQUEST[$key] = filter_input(INPUT_GET, $key, FILTER_SANITIZE_SPECIAL_CHARS);
  break;
   
  case 'int':
  $_GET[$key] = filter_input(INPUT_GET, $key, FILTER_SANITIZE_NUMBER_INT);
  $_REQUEST[$key] = filter_input(INPUT_GET, $key, FILTER_SANITIZE_NUMBER_INT);
  break;
   
  case 'float':
  # FILTER_FLAG_ALLOW_FRACTION keeps the decimal point, which this filter strips by default.
  $_GET[$key] = filter_input(INPUT_GET, $key, FILTER_SANITIZE_NUMBER_FLOAT, FILTER_FLAG_ALLOW_FRACTION);
  $_REQUEST[$key] = filter_input(INPUT_GET, $key, FILTER_SANITIZE_NUMBER_FLOAT, FILTER_FLAG_ALLOW_FRACTION);
  break;
   
  case 'encoded':
  $_GET[$key] = filter_input(INPUT_GET, $key, FILTER_SANITIZE_ENCODED);
  $_REQUEST[$key] = filter_input(INPUT_GET, $key, FILTER_SANITIZE_ENCODED);
  break;
   
  default :
  $_GET[$key] = filter_input(INPUT_GET, $key, FILTER_SANITIZE_STRING);
  $_REQUEST[$key] = filter_input(INPUT_GET, $key, FILTER_SANITIZE_STRING);
  }
  } else {
  $_GET[$key] = filter_input(INPUT_GET, $key, FILTER_SANITIZE_STRING);
  $_REQUEST[$key] = filter_input(INPUT_GET, $key, FILTER_SANITIZE_STRING);
  }
}

?>


REFERENCES
http://scovol.net/2010/02/12/generic-input-sanitizer/

Difference between 'Uplink' and Port Speed

SkyHi @ Friday, December 31, 2010
Port speed refers to the speed at which the hardware port transfers data,
while uplink refers to the speed of data transfer from the client side to
the server, which the administrator can control.


REFERENCES
http://serverfault.com/questions/217702/difference-between-uplink-and-port-speed

What is better LVM on RAID or RAID on LVM?

SkyHi @ Friday, December 31, 2010

QUESTION:

I currently have LVM on software RAID, but I'd like to ask which you think is the better solution, and maybe some pros and cons?



Edit: This is about software RAID on LVM versus LVM on software RAID. I know that hardware RAID is better if we are thinking about performance.

ANSWER:

1. Your current setup is like this:



| / | /var | /usr | /home   |
-----------------------------
|        LVM Volume         |
-----------------------------
|        RAID Volume        |
-----------------------------
| Disk 1 | Disk 2 | Disk 3  |


It's a much simpler setup with more flexibility. You can use all of the disks in the RAID volume and slice and dice them whatever way you like with LVM. The other way isn't even worth thinking about - it's ridiculously complicated and you lose the benefits of LVM at the filesystem level.



If you tried to RAID LVM volumes, you're left with a normal device without any of the LVM volume benefits (e.g. growing filesystems etc.)


2. Use hardware RAID with LVM on top: the best combination.

3. Your current setup is fine. This is the recommended way to do it.



RAID deals with keeping the bits secure/redundant/fast/whatever, and LVM helps you present them in an easy-to-use way.


REFERENCES
http://serverfault.com/questions/217666/what-is-better-lvm-on-raid-or-raid-on-lvm






How Do I Stop Hotlinking and Bandwidth Theft?

SkyHi @ Friday, December 31, 2010

You can stop others from hotlinking your site's files by placing a file called .htaccess in your Apache site root (main) directory. The period before the name means the file is hidden, so you may want to edit your file as htaccess.txt, upload it to your server, then rename the txt file to .htaccess in your directory. Contact your web host on how to access your directories and configure your .htaccess file.





Example: Your site URL is www.mysite.com. To stop other sites from hotlinking your images and display a replacement image called hotlinkp.gif (hosted on imageshack.us) instead, place this code in your .htaccess file:



RewriteEngine On

RewriteCond %{HTTP_REFERER} !^http://(.+\.)?mysite\.com/ [NC]

RewriteCond %{HTTP_REFERER} !^$

RewriteRule .*\.(jpe?g|gif|bmp|png)$ http://img148.imageshack.us/img148/237/hotlinkp.gif [L]



The first line of the above code turns on the rewrite engine. The second line matches any request whose referrer is not your own mysite.com URL. The [NC] flag means "No Case": match the URL regardless of upper or lower case letters. The third line excludes empty referrers, so requests that send no referrer at all are still allowed. The last line matches any file ending with the extension jpeg, jpg, gif, bmp, or png and replaces it with the hotlinkp.gif image from the imageshack.us server. You could just as easily use your own hotlink image by placing an image file in your site's directory and pointing to that file.
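The decision those four lines make can be sketched outside Apache. The following Python is only an illustration of the logic, not something Apache runs; the function name is_hotlink is made up for this example, and the regular expressions are copied from the rules above.

```python
import re

# Patterns copied from the RewriteCond / RewriteRule lines above.
# The [NC] flag corresponds to re.IGNORECASE.
OWN_SITE = re.compile(r"^http://(.+\.)?mysite\.com/", re.IGNORECASE)
IMAGE = re.compile(r".*\.(jpe?g|gif|bmp|png)$")


def is_hotlink(path, referer):
    """Return True when the rules above would swap in the replacement image."""
    if not IMAGE.match(path):
        return False  # the RewriteRule pattern does not match this file
    if referer == "":
        return False  # empty referrers are allowed through
    # The request is a hotlink only if the referrer is not our own site.
    return OWN_SITE.match(referer) is None


if __name__ == "__main__":
    print(is_hotlink("images/cat.jpg", "http://www.mysite.com/page.html"))
    print(is_hotlink("images/cat.jpg", "http://other.example/post"))
    print(is_hotlink("images/cat.jpg", ""))
```

Both conditions must hold at once (the referrer is not your site and it is not empty), which is why an image request with no referrer is served normally.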





To stop hotlinking from specific outside domains only, such as myspace.com, blogspot.com and livejournal.com, but allow any other web site to hotlink images:



RewriteEngine On

RewriteCond %{HTTP_REFERER} ^http://(.+\.)?myspace\.com/ [NC,OR]

RewriteCond %{HTTP_REFERER} ^http://(.+\.)?blogspot\.com/ [NC,OR]

RewriteCond %{HTTP_REFERER} ^http://(.+\.)?livejournal\.com/ [NC]

RewriteRule .*\.(jpe?g|gif|bmp|png)$ http://img148.imageshack.us/img148/237/hotlinkp.gif [L]



You can add as many different domains as needed. Each RewriteCond line except the last should end with the [NC,OR] flags. NC means to ignore upper and lower case. OR means "Or Next", as in, match this domain or the one on the next line. The last domain listed omits the OR flag, since there are no more RewriteCond lines to match after it.
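As a rough Python sketch of the same blocklist logic (again only an illustration; the names BLOCKED_DOMAINS and is_blocked_hotlink are invented here), the [OR] chain behaves like any() over one pattern per domain:

```python
import re

# Domains taken from the example above; add more entries as needed.
BLOCKED_DOMAINS = ["myspace.com", "blogspot.com", "livejournal.com"]

# One pattern per RewriteCond line; [NC] corresponds to re.IGNORECASE.
BLOCKED = [
    re.compile(r"^http://(.+\.)?" + re.escape(domain) + "/", re.IGNORECASE)
    for domain in BLOCKED_DOMAINS
]
IMAGE = re.compile(r".*\.(jpe?g|gif|bmp|png)$")


def is_blocked_hotlink(path, referer):
    """Return True when an image request comes from one of the blocked domains."""
    if not IMAGE.match(path):
        return False
    # any() plays the role of the [OR] flag: one matching domain is enough.
    return any(pattern.match(referer) for pattern in BLOCKED)


if __name__ == "__main__":
    print(is_blocked_hotlink("pic.png", "http://foo.blogspot.com/2010/12/"))
    print(is_blocked_hotlink("pic.png", "http://other.example/"))
```

Unlike the first rule set, an empty referrer needs no special case here, because it never matches any of the blocked domains.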





You can display a 403 Forbidden error code instead of an image. Replace the last line of the previous examples with this line:



RewriteRule .*\.(jpe?g|gif|bmp|png)$ - [F]





Warning: Do not use .htaccess to redirect image hotlinks to another HTML page or server that isn't your own (such as this html page). Hotlinked images can only be replaced by other images, not with an HTML page.



As with any htaccess rewrites, you may block some legitimate traffic (such as users behind proxies or firewalls) using these techniques.

REFERENCES
http://altlab.com/hotlinking.html
http://altlab.com/htaccess_tutorial.html

robots.txt wget

SkyHi @ Friday, December 31, 2010
User-agent: *
Disallow: /gallery/
Disallow: /images/

User-agent: baiduspider
Disallow: /

User-agent: msnbot
Disallow: /

User-agent: Teoma
Disallow: /

User-agent: TurnitinBot
Disallow: /

User-agent: WISEnutbot
Disallow: /

User-agent: ZyBorg/1.0 Dead Link Checker (wn.dlc@looksmart.net; http://www.WISEnutbot.com)
Disallow: /

User-agent: (wn.dlc@looksmart.net; http://www.WISEnutbot.com)
Disallow: /

User-agent: NaverBot
Disallow: /

User-agent: BotRightHere
Disallow: /

User-agent: WebZip
Disallow: /

User-agent: larbin
Disallow: /

User-agent: b2w/0.1
Disallow: /

User-agent: Copernic
Disallow: /

User-agent: psbot
Disallow: /

User-agent: Python-urllib
Disallow: /

User-agent: NetMechanic
Disallow: /

User-agent: URL_Spider_Pro
Disallow: /

User-agent: CherryPicker
Disallow: /

User-agent: EmailCollector
Disallow: /

User-agent: EmailSiphon
Disallow: /

User-agent: WebBandit
Disallow: /

User-agent: EmailWolf
Disallow: /

User-agent: ExtractorPro
Disallow: /

User-agent: CopyRightCheck
Disallow: /

User-agent: Crescent
Disallow: /

User-agent: SiteSnagger
Disallow: /

User-agent: ProWebWalker
Disallow: /

User-agent: CheeseBot
Disallow: /

User-agent: LNSpiderguy
Disallow: /

User-agent: Alexibot
Disallow: /

User-agent: Teleport
Disallow: /

User-agent: TeleportPro
Disallow: /

User-agent: MIIxpc
Disallow: /

User-agent: Telesoft
Disallow: /

User-agent: Website Quester
Disallow: /

User-agent: WebZip
Disallow: /

User-agent: moget/2.1
Disallow: /

User-agent: WebZip/4.0
Disallow: /

User-agent: WebStripper
Disallow: /

User-agent: WebSauger
Disallow: /

User-agent: WebCopier
Disallow: /

User-agent: NetAnts
Disallow: /

User-agent: Mister PiX
Disallow: /

User-agent: WebAuto
Disallow: /

User-agent: TheNomad
Disallow: /

User-agent: WWW-Collector-E
Disallow: /

User-agent: RMA
Disallow: /

User-agent: libWeb/clsHTTP
Disallow: /

User-agent: asterias
Disallow: /

User-agent: httplib
Disallow: /

User-agent: turingos
Disallow: /

User-agent: spanner
Disallow: /

User-agent: InfoNaviRobot
Disallow: /

User-agent: Harvest/1.5
Disallow: /

User-agent: Bullseye/1.0
Disallow: /

User-agent: Mozilla/4.0 (compatible; BullsEye; Windows 95)
Disallow: /

User-agent: Crescent Internet ToolPak HTTP OLE Control v.1.0
Disallow: /

User-agent: CherryPickerSE/1.0
Disallow: /

User-agent: CherryPickerElite/1.0
Disallow: /

User-agent: WebBandit/3.50
Disallow: /

User-agent: NICErsPRO
Disallow: /

User-agent: Microsoft URL Control - 5.01.4511
Disallow: /

User-agent: DittoSpyder
Disallow: /

User-agent: Foobot
Disallow: /

User-agent: SpankBot
Disallow: /

User-agent: BotALot
Disallow: /

User-agent: lwp-trivial/1.34
Disallow: /

User-agent: lwp-trivial
Disallow: /

User-agent: BunnySlippers
Disallow: /

User-agent: Microsoft URL Control - 6.00.8169
Disallow: /

User-agent: URLy Warning
Disallow: /

User-agent: Wget/1.6
Disallow: /

User-agent: Wget/1.5.3
Disallow: /

User-agent: Wget
Disallow: /

User-agent: LinkWalker
Disallow: /

User-agent: cosmos
Disallow: /

User-agent: moget
Disallow: /

User-agent: hloader
Disallow: /

User-agent: humanlinks
Disallow: /

User-agent: LinkextractorPro
Disallow: /

User-agent: Offline Explorer
Disallow: /

User-agent: Mata Hari
Disallow: /

User-agent: LexiBot
Disallow: /

User-agent: Web Image Collector
Disallow: /

User-agent: The Intraformant
Disallow: /

User-agent: True_Robot/1.0
Disallow: /

User-agent: True_Robot
Disallow: /

User-agent: BlowFish/1.0
Disallow: /

User-agent: JennyBot
Disallow: /

User-agent: MIIxpc/4.2
Disallow: /

User-agent: BuiltBotTough
Disallow: /

User-agent: ProPowerBot/2.14
Disallow: /

User-agent: BackDoorBot/1.0
Disallow: /

User-agent: toCrawl/UrlDispatcher
Disallow: /

User-agent: WebEnhancer
Disallow: /

User-agent: suzuran
Disallow: /

User-agent: TightTwatBot
Disallow: /

User-agent: VCI WebViewer VCI WebViewer Win32
Disallow: /

User-agent: VCI
Disallow: /

User-agent: Szukacz/1.4
Disallow: /

User-agent: QueryN Metasearch
Disallow: /

User-agent: Openfind data gatherer
Disallow: /

User-agent: Openfind
Disallow: /

User-agent: Xenu's Link Sleuth 1.1c
Disallow: /

User-agent: Xenu's
Disallow: /

User-agent: Zeus
Disallow: /

User-agent: RepoMonkey Bait & Tackle/v1.01
Disallow: /

User-agent: RepoMonkey
Disallow: /

User-agent: Microsoft URL Control
Disallow: /

User-agent: Openbot
Disallow: /

User-agent: URL Control
Disallow: /

User-agent: Zeus Link Scout
Disallow: /

User-agent: Zeus 32297 Webster Pro V2.9 Win32
Disallow: /

User-agent: Webster Pro
Disallow: /

User-agent: EroCrawler
Disallow: /

User-agent: LinkScan/8.1a Unix
Disallow: /

User-agent: Keyword Density/0.9
Disallow: /

User-agent: Kenjin Spider
Disallow: /

User-agent: Iron33/1.0.2
Disallow: /

User-agent: Bookmark search tool
Disallow: /

User-agent: GetRight/4.2
Disallow: /

User-agent: FairAd Client
Disallow: /

User-agent: Gaisbot
Disallow: /

User-agent: Aqua_Products
Disallow: /

User-agent: Radiation Retriever 1.1
Disallow: /

User-agent: Flaming AttackBot
Disallow: /

User-agent: Oracle Ultra Search
Disallow: /

User-agent: MSIECrawler
Disallow: /

User-agent: PerMan
Disallow: /

User-agent: searchpreview
Disallow: /

User-agent: TurnitinBot
Disallow: /

User-agent: ExtractorPro
Disallow: /

User-agent: WebZIP/4.21
Disallow: /

User-agent: WebZIP/5.0
Disallow: /

User-agent: HTTrack 3.0
Disallow: /

User-agent: TurnitinBot/1.5
Disallow: /

User-agent: WebCopier v3.2a
Disallow: /

User-agent: WebCapture 2.0
Disallow: /

User-agent: WebCopier v.2.2
Disallow: /
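To see how a compliant client would read rules like these, here is a small Python sketch using the standard library's urllib.robotparser. It parses only a short excerpt of the file above, and www.example.com plus the user-agent string "SomeBrowser" are placeholders chosen for the example. Keep in mind that robots.txt is advisory, so this shows what a well-behaved crawler would conclude, not what a hostile one will do.

```python
from urllib.robotparser import RobotFileParser

# A short excerpt of the rules listed above.
ROBOTS_EXCERPT = """\
User-agent: *
Disallow: /gallery/
Disallow: /images/

User-agent: baiduspider
Disallow: /

User-agent: Wget
Disallow: /
"""


def build_parser(text):
    """Parse robots.txt content that is already in memory."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_EXCERPT)

if __name__ == "__main__":
    checks = [
        ("Wget/1.12", "http://www.example.com/index.html"),
        ("SomeBrowser", "http://www.example.com/gallery/a.jpg"),
        ("SomeBrowser", "http://www.example.com/index.html"),
    ]
    for agent, url in checks:
        print(agent, url, parser.can_fetch(agent, url))
```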

REFERENCES
http://zmievski.org

Mapping keys in Vim - Tutorial (Part 1)

SkyHi @ Friday, December 31, 2010
To display the mode specific maps, prefix the ':map' command with the letter representing the mode.

:nmap - Display normal mode maps
:imap - Display insert mode maps
:vmap - Display visual and select mode maps
:smap - Display select mode maps
:xmap - Display visual mode maps
:cmap - Display command-line mode maps
:omap - Display operator pending mode maps


REFERENCES
http://vim.wikia.com/wiki/Mapping_keys_in_Vim_-_Tutorial_%28Part_1%29

Thursday, December 30, 2010

Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase comparison

SkyHi @ Thursday, December 30, 2010

While SQL databases are insanely useful tools, their tyranny of ~15 years is coming to an end.
And it was about time: I can't even count the things that were forced into relational databases
but never really fit them.

But the differences between "NoSQL" databases are much bigger than they ever were between one
SQL database and another. This means that software architects bear a bigger responsibility
to choose the appropriate one for a project right at the beginning.

In this light, here is a comparison of Cassandra, MongoDB, CouchDB, Redis, Riak and HBase:


CouchDB


  • Written in: Erlang
  • Main point: DB consistency, ease of use
  • License: Apache
  • Protocol: HTTP/REST
  • Bi-directional (!) replication,
  • continuous or ad-hoc,
  • with conflict detection,
  • thus, master-master replication. (!)
  • MVCC - write operations do not block reads
  • Previous versions of documents are available
  • Crash-only (reliable) design
  • Needs compacting from time to time
  • Views: embedded map/reduce
  • Formatting views: lists & shows
  • Server-side document validation possible
  • Authentication possible
  • Real-time updates via _changes (!)
  • Attachment handling
  • thus, CouchApps (standalone js apps)
  • jQuery library included

Best used:
For accumulating, occasionally changing data, on which pre-defined queries are to be run. Places where versioning is important.

For example:
CRM, CMS systems. Master-master replication is an especially interesting feature, allowing easy multi-site deployments.




Redis


  • Written in: C
  • Main point: Blazing fast
  • License: BSD
  • Protocol: Telnet-like
  • Disk-backed in-memory database,
  • but since 2.0, it can swap to disk.
  • Master-slave replication
  • Simple keys and values,
  • but complex operations like ZREVRANGEBYSCORE
  • INCR & co (good for rate limiting or statistics)
  • Has sets (also union/diff/inter)
  • Has lists (also a queue; blocking pop)
  • Has hashes (objects of multiple fields)
  • Of all these databases, only Redis does transactions (!)
  • Values can be set to expire (as in a cache)
  • Sorted sets (high score table, good for range queries)
  • Pub/Sub and WATCH on data changes (!)
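The "INCR & co" bullet mentions rate limiting. A minimal sketch of that idea follows, assuming a fixed-window counter: one counter per client that is incremented on every request and expires after the window. This is plain Python with a dictionary standing in for Redis, not a Redis client; the class name FixedWindowLimiter and its parameters are invented for the example. In Redis the increment would be a single atomic INCR on a key with an expiry set.

```python
import time


class FixedWindowLimiter:
    """In-memory stand-in for the counter-with-expiry pattern (not a Redis client)."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        # key -> (count, expires_at), mimicking a counter key that has a TTL
        self.counters = {}

    def allow(self, key):
        now = self.clock()
        count, expires_at = self.counters.get(key, (0, 0.0))
        if now >= expires_at:
            # The key expired (or never existed): start a new window.
            count, expires_at = 0, now + self.window
        count += 1  # the step Redis would perform atomically with INCR
        self.counters[key] = (count, expires_at)
        return count <= self.limit


if __name__ == "__main__":
    limiter = FixedWindowLimiter(limit=2, window_seconds=10)
    print([limiter.allow("client-1") for _ in range(3)])
```

The clock parameter exists only so the window can be advanced by hand when trying the sketch out.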

Best used:
For rapidly changing data with a foreseeable database size (should fit mostly in memory).

For example:
Stock prices. Analytics. Real-time data collection. Real-time communication.






MongoDB


  • Written in: C++
  • Main point: Retains some friendly properties of SQL. (Query, index)
  • License: AGPL (Drivers: Apache)
  • Protocol: Custom, binary (BSON)
  • Master/slave replication
  • Queries are javascript expressions
  • Run arbitrary javascript functions server-side
  • Better update-in-place than CouchDB
  • Sharding built-in
  • Uses memory mapped files for data storage
  • Performance over features
  • After crash, it needs to repair tables

Best used:
If you need dynamic queries. If you prefer to define indexes, not map/reduce functions. If you need good performance on a big DB. If you wanted CouchDB, but your data changes too much, filling up disks.

For example:
For all things that you would do with MySQL or PostgreSQL, but having predefined columns really holds you back.




Cassandra


  • Written in: Java
  • Main point: Best of BigTable and Dynamo
  • License: Apache
  • Protocol: Custom, binary (Thrift)
  • Tunable trade-offs for distribution and replication (N, R, W)
  • Querying by column, range of keys
  • BigTable-like features: columns, column families
  • Writes are much faster than reads (!)
  • Map/reduce possible with Apache Hadoop
  • I admit being a bit biased against it, because of the bloat and complexity it has partly because of Java (configuration, seeing exceptions, etc)
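The "(N, R, W)" bullet can be made concrete with a little arithmetic. Reading it in the usual Dynamo sense (an assumption here: N is the number of replicas, R the number that must answer a read, W the number that must acknowledge a write), a read is guaranteed to touch at least one replica holding the latest write exactly when R + W > N. The Python sketch below checks that rule and confirms it by brute force; the function names are invented for the example.

```python
from itertools import combinations


def quorum_overlaps(n, r, w):
    """True when every read quorum shares at least one replica with every write quorum."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n


def overlaps_by_enumeration(n, r, w):
    """Check the same property by trying every possible pair of quorums."""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(replicas, r)
        for write_set in combinations(replicas, w)
    )


if __name__ == "__main__":
    for n, r, w in [(3, 2, 2), (3, 1, 1), (3, 1, 3)]:
        print(n, r, w, quorum_overlaps(n, r, w), overlaps_by_enumeration(n, r, w))
```

Lower R or W makes the corresponding operation cheaper at the cost of that overlap guarantee, which is the trade-off being tuned.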

Best used:
If you're in love with BigTable. :) When you write more than you read (logging). If every component of the system must be in Java. ("No one gets fired for choosing Apache's stuff.")

For example:
Banking, financial industry






Riak


  • Written in: Erlang & C, some Javascript
  • Main point: Fault tolerance
  • License: Apache
  • Protocol: HTTP/REST
  • Tunable trade-offs for distribution and replication (N, R, W)
  • Pre- and post-commit hooks,
  • for validation and security.
  • Built-in full-text search
  • Map/reduce in javascript or Erlang
  • Comes in "open source" and "enterprise" editions

Best used:
If you want something Cassandra-like (Dynamo-like), but no way you're gonna deal with the bloat and complexity. If you need very good single-site scalability, availability and fault-tolerance, but you're ready to pay for multi-site replication.

For example:
Point-of-sales data collection. Factory control systems. Places where even seconds of downtime hurt.




HBase


(With the help of ghshephard)

  • Written in: Java
  • Main point: Billions of rows X millions of columns
  • License: Apache
  • Protocol: HTTP/REST (also Thrift)
  • Modeled after BigTable
  • Map/reduce with Hadoop
  • Query predicate push down via server side scan and get filters
  • Optimizations for real time queries
  • A high performance Thrift gateway
  • HTTP supports XML, Protobuf, and binary
  • Cascading, hive, and pig source and sink modules
  • Jruby-based (JIRB) shell
  • No single point of failure
  • Rolling restart for configuration changes and minor upgrades
  • Random access performance is like MySQL

Best used:
Use it when you need random, realtime read/write access to your Big Data.

For example:
Facebook Messaging Database (more general example coming soon)





Of course, all of these systems have many more features than what's listed here. I only wanted to list the key points that I base my decisions on. Also, development of all of them is moving very fast, so things are bound to change. I'll do my best to keep this list updated.

-- Kristof


REFERENCES

http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis

keep the Sent Items in sync across machines via imap and SquirrelMail

SkyHi @ Thursday, December 30, 2010
1. Verify SquirrelMail -> Options -> Folder Preferences -> Sent Folder: Sent

2. Enable IMAP in Outlook 2007 on both machines

3. The first time you send an e-mail message with your IMAP account, you are prompted to choose the folder where you want sent items saved. Pick Custom Folder -> mail -> Sent

OR
   1. On the Tools menu, click Account Settings.
   2. Select an e-mail account that is not an Exchange account, and then click Change.
   3. Click More Settings.
   4. In the Internet E-mail Settings dialog box, click the Folders tab.
   5. Click "Choose an existing folder or create a new folder to save your sent items for this account in", expand the folder list, and then click the folder mail -> Sent

4. Click the Send/Receive button on the other machine. Now both machines and SquirrelMail should contain the e-mail.





References
http://office.microsoft.com/en-us/outlook-help/change-where-sent-e-mail-messages-are-saved-HA010164216.aspx
http://www.entourage.mvps.org/database/sync.html
http://www.sevenforums.com/browsers-mail/45179-how-save-sent-items-imap-server-live-mail.html
http://www.question-defense.com/2009/03/05/in-outlook-2007-save-copy-of-sent-pop-account-messages-to-gmail-imap-sent-folder
http://www.msoutlook.info/question/486
http://www.ehow.com/how_4831673_save-sent-emails-imap-folder.html