Wednesday, December 9, 2009

sysctl - Kernel Optimization - /etc/sysctl.conf

SkyHi @ Wednesday, December 09, 2009
IP Forwarding:
Is IP forwarding currently on?

/sbin/sysctl net.ipv4.ip_forward


Turn IP forwarding on manually

/sbin/sysctl -w net.ipv4.ip_forward=1


Turning IP packet forwarding off manually


[root@plain scripts]# /sbin/sysctl -w net.ipv4.ip_forward=1
net.ipv4.ip_forward = 1
[root@plain scripts]# /sbin/sysctl -w net.ipv4.ip_forward=0
net.ipv4.ip_forward = 0


The following command does the same job by writing to /proc directly. Like sysctl -w, the change lasts only until the next reboot

echo 0 > /proc/sys/net/ipv4/ip_forward


Neither command edits /etc/sysctl.conf. To keep the setting across reboots, the file must contain the corresponding entry

# Controls IP packet forwarding
net.ipv4.ip_forward = 0
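
The sysctl key and the /proc/sys path used in the commands above name the same kernel variable: the dots in the key become slashes under /proc/sys. A minimal sketch of that mapping follows. The function name and the PROC_SYS_ROOT override are made up for illustration, and interface names that themselves contain dots are not handled.

```shell
#!/bin/sh
# Convert a sysctl key such as net.ipv4.ip_forward into its /proc/sys path.
# PROC_SYS_ROOT is a hypothetical override so the function can be pointed
# at a scratch directory instead of the live kernel tree.
sysctl_key_to_path() {
    printf '%s/%s\n' "${PROC_SYS_ROOT:-/proc/sys}" "$(printf '%s' "$1" | tr '.' '/')"
}

sysctl_key_to_path net.ipv4.ip_forward
```

With PROC_SYS_ROOT unset, this prints /proc/sys/net/ipv4/ip_forward.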


By default, the /etc/sysctl.conf file looks like the following:

# Kernel sysctl configuration file for Red Hat Linux
#
# For binary values, 0 is disabled, 1 is enabled. See sysctl(8) and
# sysctl.conf(5) for more details.

# Controls IP packet forwarding
net.ipv4.ip_forward = 0

# Controls source route verification
net.ipv4.conf.default.rp_filter = 1

# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0

# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1




I will be optimizing this file (use at your own risk).

Let's check the current memory values for socket I/O operations

[root@plain scripts]# /sbin/sysctl net.ipv4.ip_forward
net.ipv4.ip_forward = 0
[root@plain scripts]# /sbin/sysctl net.core.rmem_default
net.core.rmem_default = 65535
[root@plain scripts]# /sbin/sysctl net.core.rmem_max
net.core.rmem_max = 131071
[root@plain scripts]# /sbin/sysctl net.ipv4.tcp_rmem
net.ipv4.tcp_rmem = 4096 87380 174760
[root@plain scripts]# /sbin/sysctl net.core.wmem_default
net.core.wmem_default = 65535
[root@plain scripts]# /sbin/sysctl net.core.wmem_max
net.core.wmem_max = 131071
[root@plain scripts]# /sbin/sysctl net.ipv4.tcp_wmem
net.ipv4.tcp_wmem = 4096 16384 131072
[root@plain scripts]# /sbin/sysctl net.ipv4.tcp_mem
net.ipv4.tcp_mem = 195584 196096 196608
[root@plain scripts]# /sbin/sysctl net.core.optmem_max
net.core.optmem_max = 10240


I will be increasing the following values:

net.core.rmem_max (from 131071 to 8388608)
net.ipv4.tcp_rmem (from "4096 87380 174760" to "4096 1048576 8388608")
net.core.wmem_max (from 131071 to 8388608)
net.ipv4.tcp_wmem (from "4096 16384 131072" to "4096 1048576 8388608")
net.ipv4.tcp_mem (from "195584 196096 196608" to "8388608 8388608 8388608")
net.core.optmem_max (from 10240 to 40960)
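
When reading these numbers it helps to keep the units in mind. tcp_rmem and tcp_wmem each hold three byte counts: the minimum, default and maximum buffer size for a socket. tcp_mem, according to the kernel's ip-sysctl documentation, is counted in memory pages rather than bytes. The sketch below labels the three fields and converts a page count to MiB. The function names are made up for illustration, the 4096-byte page size is an assumption (getconf PAGESIZE reports the real one), and the arithmetic assumes a shell with 64-bit integers.

```shell
#!/bin/sh
# Label the three fields of a tcp_rmem / tcp_wmem value (sizes in bytes).
describe_tcp_buffer() {
    # Intentional word splitting: $1 holds three whitespace-separated integers.
    set -- $1
    printf 'min=%s default=%s max=%s\n' "$1" "$2" "$3"
}

# Convert a tcp_mem page count to MiB. The optional second argument is the
# page size in bytes; 4096 is assumed when it is omitted.
pages_to_mib() {
    echo $(( $1 * ${2:-4096} / 1048576 ))
}

describe_tcp_buffer "4096 1048576 8388608"
pages_to_mib 196608     # largest of the original tcp_mem values
pages_to_mib 8388608    # proposed tcp_mem value
```

With 4 KiB pages the last two lines print 768 and 32768, so a tcp_mem value of 8388608 corresponds to roughly 32 GiB. On a machine with less memory than that, smaller page counts may be the more sensible target.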

Changes can be made by placing the following lines in /etc/sysctl.conf


net.core.rmem_max = 8388608
net.ipv4.tcp_rmem = 4096 1048576 8388608
net.core.wmem_max = 8388608
net.ipv4.tcp_wmem = 4096 1048576 8388608
net.ipv4.tcp_mem = 8388608 8388608 8388608
net.core.optmem_max = 40960


OR by issuing the following commands


/sbin/sysctl -w net.core.rmem_max=8388608
/sbin/sysctl -w net.ipv4.tcp_rmem="4096 1048576 8388608"
/sbin/sysctl -w net.core.wmem_max=8388608
/sbin/sysctl -w net.ipv4.tcp_wmem="4096 1048576 8388608"
/sbin/sysctl -w net.ipv4.tcp_mem="8388608 8388608 8388608"
/sbin/sysctl -w net.core.optmem_max=40960
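
For the /etc/sysctl.conf route, appending lines by hand can leave duplicate or conflicting entries if the key is already in the file. The sketch below rewrites an existing entry or appends a new one, so running it twice gives the same result. set_sysctl_conf is a hypothetical helper, not part of the sysctl tool, and the example works on a scratch file rather than the real /etc/sysctl.conf.

```shell
#!/bin/sh
# Set "key = value" in a sysctl.conf-style file: rewrite the first matching
# entry, drop later duplicates, or append the entry if the key is absent.
set_sysctl_conf() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    if awk -v k="$key" -v v="$value" '
        {
            name = $0
            sub(/[ \t]*=.*$/, "", name)   # keep the text left of "="
            sub(/^[ \t]+/, "", name)      # trim leading blanks
            if (index($0, "=") > 0 && name == k) {
                if (!seen) print k " = " v
                seen = 1
                next
            }
            print
        }
        END { if (!seen) print k " = " v }
    ' "$file" > "$tmp"; then
        cat "$tmp" > "$file"   # overwrite in place, keeping owner and mode
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Try it on a scratch copy, not on the live configuration.
scratch=$(mktemp)
printf '%s\n' '# Controls IP packet forwarding' 'net.ipv4.ip_forward = 0' > "$scratch"
set_sysctl_conf "$scratch" net.core.rmem_max 8388608
set_sysctl_conf "$scratch" net.ipv4.tcp_rmem "4096 1048576 8388608"
set_sysctl_conf "$scratch" net.core.rmem_max 8388608   # second run changes nothing
cat "$scratch"
rm -f "$scratch"
```

Commented-out lines such as "# net.core.rmem_max = ..." are left alone, because the text left of the "=" then starts with "#" and does not equal the key.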


Some more Optimizations

# increase the TCP TIME-WAIT bucket pool size
# (net.ipv4.tcp_max_tw_buckets) from 180000 to 360000
/sbin/sysctl -w net.ipv4.tcp_max_tw_buckets=360000
# Increase the maximum number of skb-heads to be cached from 128
/sbin/sysctl -w net.core.hot_list_length=256
# increase the backlog of received packets queued for processing from 300 to 1024
/sbin/sysctl -w net.core.netdev_max_backlog=1024
#increase TCP Re-Ordering value in kernel from 3 to 5
/sbin/sysctl -w net.ipv4.tcp_reordering=5
# change from 0 to 1 to ignore ICMP echo requests sent to broadcast addresses
/sbin/sysctl -w net.ipv4.icmp_echo_ignore_broadcasts=1
# change from 0 to 1 to enable syn cookies protection
/sbin/sysctl -w net.ipv4.tcp_syncookies=1
# turn on TCP time stamps
/sbin/sysctl -w net.ipv4.tcp_timestamps=1
# change from 0 to 1 to enable selective acknowledgements (FACK was already enabled)
/sbin/sysctl -w net.ipv4.tcp_sack=1
# change from 0 to 1 for TCP window scaling
/sbin/sysctl -w net.ipv4.tcp_window_scaling=1
# decrease tcp_keepalive_time from 1400 to 1200 seconds
/sbin/sysctl -w net.ipv4.tcp_keepalive_time=1200
# decrease tcp_fin_timeout from 1400 to 25 seconds
/sbin/sysctl -w net.ipv4.tcp_fin_timeout=25
# change from 0 to 1 to Log Spoofed Packets, Source Routed Packets, Redirect Packets
/sbin/sysctl -w net.ipv4.conf.default.log_martians=1
/sbin/sysctl -w net.ipv4.conf.all.log_martians=1
#disable ICMP redirects
/sbin/sysctl -w net.ipv4.conf.default.accept_redirects=0
/sbin/sysctl -w net.ipv4.conf.all.accept_redirects=0
# disable IP source routing
/sbin/sysctl -w net.ipv4.conf.default.accept_source_route=0
/sbin/sysctl -w net.ipv4.conf.all.accept_source_route=0
# enable source route verification
/sbin/sysctl -w net.ipv4.conf.all.rp_filter=1
/sbin/sysctl -w net.ipv4.conf.default.rp_filter=1
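
Not every key in this list exists on every kernel. A commenter below reports that net.core.hot_list_length is an unknown key on a 2.6.9 kernel. One way to keep a tuning script from failing is to check for the key under /proc/sys before writing to it. This is a minimal sketch: set_if_present and the PROC_SYS_ROOT override are hypothetical names used for illustration, and the example runs against a scratch directory rather than the live kernel.

```shell
#!/bin/sh
# Write a value only if the key exists under /proc/sys on this kernel.
# PROC_SYS_ROOT is a hypothetical override for dry runs against a scratch tree.
set_if_present() {
    key=$1
    value=$2
    path="${PROC_SYS_ROOT:-/proc/sys}/$(printf '%s' "$key" | tr '.' '/')"
    if [ -f "$path" ]; then
        printf '%s\n' "$value" > "$path"
    else
        printf 'skipping %s: not present on this kernel\n' "$key"
    fi
}

# Dry run: the scratch tree has netdev_max_backlog but no hot_list_length.
PROC_SYS_ROOT=$(mktemp -d)
mkdir -p "$PROC_SYS_ROOT/net/core"
echo 300 > "$PROC_SYS_ROOT/net/core/netdev_max_backlog"
set_if_present net.core.netdev_max_backlog 1024
set_if_present net.core.hot_list_length 256
cat "$PROC_SYS_ROOT/net/core/netdev_max_backlog"
rm -rf "$PROC_SYS_ROOT"
unset PROC_SYS_ROOT
```

Run as root with PROC_SYS_ROOT unset, the same function writes to the real /proc/sys tree and skips any key the running kernel does not provide.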


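Values set with sysctl -w apply immediately. For settings placed in /etc/sysctl.conf to take effect without a reboot, we must reload the file and flush the routing cache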

/sbin/sysctl -p
# and
/sbin/sysctl -w net.ipv4.route.flush=1







sysctl -A provides the following output


abi.fake_utsname = 0
abi.trace = 0
abi.defhandler_libcso = 68157441
abi.defhandler_lcall7 = 68157441
abi.defhandler_elf = 0
abi.defhandler_coff = 117440515
dev.parport.default.spintime = 500
dev.parport.default.timeslice = 200
dev.raid.speed_limit_max = 10000
dev.raid.speed_limit_min = 100
dev.rtc.max-user-freq = 64
debug.rpmarch =
debug.kerneltype =
net.unix.max_dgram_qlen = 10
net.token-ring.rif_timeout = 60000
net.ipv4.ip_conntrack_max = 34576
net.ipv4.conf.eth0.force_igmp_version = 0
net.ipv4.conf.eth0.disable_policy = 0
net.ipv4.conf.eth0.disable_xfrm = 0
net.ipv4.conf.eth0.arp_filter = 0
net.ipv4.conf.eth0.tag = 0
net.ipv4.conf.eth0.log_martians = 0
net.ipv4.conf.eth0.bootp_relay = 0
net.ipv4.conf.eth0.medium_id = 0
net.ipv4.conf.eth0.proxy_arp = 0
net.ipv4.conf.eth0.accept_source_route = 1
net.ipv4.conf.eth0.send_redirects = 1
net.ipv4.conf.eth0.rp_filter = 1
net.ipv4.conf.eth0.shared_media = 1
net.ipv4.conf.eth0.secure_redirects = 1
net.ipv4.conf.eth0.accept_redirects = 1
net.ipv4.conf.eth0.mc_forwarding = 0
net.ipv4.conf.eth0.forwarding = 0
net.ipv4.conf.lo.force_igmp_version = 0
net.ipv4.conf.lo.disable_policy = 0
net.ipv4.conf.lo.disable_xfrm = 0
net.ipv4.conf.lo.arp_filter = 0
net.ipv4.conf.lo.tag = 0
net.ipv4.conf.lo.log_martians = 0
net.ipv4.conf.lo.bootp_relay = 0
net.ipv4.conf.lo.medium_id = 0
net.ipv4.conf.lo.proxy_arp = 0
net.ipv4.conf.lo.accept_source_route = 1
net.ipv4.conf.lo.send_redirects = 1
net.ipv4.conf.lo.rp_filter = 1
net.ipv4.conf.lo.shared_media = 1
net.ipv4.conf.lo.secure_redirects = 1
net.ipv4.conf.lo.accept_redirects = 1
net.ipv4.conf.lo.mc_forwarding = 0
net.ipv4.conf.lo.forwarding = 0
net.ipv4.conf.default.force_igmp_version = 0
net.ipv4.conf.default.disable_policy = 0
net.ipv4.conf.default.disable_xfrm = 0
net.ipv4.conf.default.arp_filter = 0
net.ipv4.conf.default.tag = 0
net.ipv4.conf.default.log_martians = 1
net.ipv4.conf.default.bootp_relay = 0
net.ipv4.conf.default.medium_id = 0
net.ipv4.conf.default.proxy_arp = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.default.send_redirects = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.shared_media = 1
net.ipv4.conf.default.secure_redirects = 1
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.default.mc_forwarding = 0
net.ipv4.conf.default.forwarding = 0
net.ipv4.conf.all.force_igmp_version = 0
net.ipv4.conf.all.disable_policy = 0
net.ipv4.conf.all.disable_xfrm = 0
net.ipv4.conf.all.arp_filter = 0
net.ipv4.conf.all.tag = 0
net.ipv4.conf.all.log_martians = 1
net.ipv4.conf.all.bootp_relay = 0
net.ipv4.conf.all.medium_id = 0
net.ipv4.conf.all.proxy_arp = 0
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.all.send_redirects = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.all.shared_media = 1
net.ipv4.conf.all.secure_redirects = 1
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.mc_forwarding = 0
net.ipv4.conf.all.forwarding = 0
net.ipv4.neigh.eth0.locktime = 100
net.ipv4.neigh.eth0.proxy_delay = 80
net.ipv4.neigh.eth0.anycast_delay = 100
net.ipv4.neigh.eth0.proxy_qlen = 64
net.ipv4.neigh.eth0.unres_qlen = 3
net.ipv4.neigh.eth0.gc_stale_time = 60
net.ipv4.neigh.eth0.delay_first_probe_time = 5
net.ipv4.neigh.eth0.base_reachable_time = 30
net.ipv4.neigh.eth0.retrans_time = 100
net.ipv4.neigh.eth0.app_solicit = 0
net.ipv4.neigh.eth0.ucast_solicit = 3
net.ipv4.neigh.eth0.mcast_solicit = 3
net.ipv4.neigh.lo.locktime = 100
net.ipv4.neigh.lo.proxy_delay = 80
net.ipv4.neigh.lo.anycast_delay = 100
net.ipv4.neigh.lo.proxy_qlen = 64
net.ipv4.neigh.lo.unres_qlen = 3
net.ipv4.neigh.lo.gc_stale_time = 60
net.ipv4.neigh.lo.delay_first_probe_time = 5
net.ipv4.neigh.lo.base_reachable_time = 30
net.ipv4.neigh.lo.retrans_time = 100
net.ipv4.neigh.lo.app_solicit = 0
net.ipv4.neigh.lo.ucast_solicit = 3
net.ipv4.neigh.lo.mcast_solicit = 3
net.ipv4.neigh.default.gc_thresh3 = 1024
net.ipv4.neigh.default.gc_thresh2 = 512
net.ipv4.neigh.default.gc_thresh1 = 128
net.ipv4.neigh.default.gc_interval = 30
net.ipv4.neigh.default.locktime = 100
net.ipv4.neigh.default.proxy_delay = 80
net.ipv4.neigh.default.anycast_delay = 100
net.ipv4.neigh.default.proxy_qlen = 64
net.ipv4.neigh.default.unres_qlen = 3
net.ipv4.neigh.default.gc_stale_time = 60
net.ipv4.neigh.default.delay_first_probe_time = 5
net.ipv4.neigh.default.base_reachable_time = 30
net.ipv4.neigh.default.retrans_time = 100
net.ipv4.neigh.default.app_solicit = 0
net.ipv4.neigh.default.ucast_solicit = 3
net.ipv4.neigh.default.mcast_solicit = 3
net.ipv4.ipfrag_secret_interval = 600
net.ipv4.tcp_low_latency = 0
net.ipv4.tcp_frto = 0
net.ipv4.tcp_tw_reuse = 0
net.ipv4.icmp_ratemask = 6168
net.ipv4.icmp_ratelimit = 100
net.ipv4.tcp_adv_win_scale = 2
net.ipv4.tcp_app_win = 31
net.ipv4.tcp_rmem = 4096 1048576 8388608
net.ipv4.tcp_wmem = 4096 1048576 8388608
net.ipv4.tcp_mem = 8388608 8388608 8388608
net.ipv4.tcp_dsack = 1
net.ipv4.tcp_ecn = 1
net.ipv4.tcp_reordering = 5
net.ipv4.tcp_fack = 1
net.ipv4.tcp_orphan_retries = 0
net.ipv4.inet_peer_gc_maxtime = 120
net.ipv4.inet_peer_gc_mintime = 10
net.ipv4.inet_peer_maxttl = 600
net.ipv4.inet_peer_minttl = 120
net.ipv4.inet_peer_threshold = 65664
net.ipv4.igmp_max_memberships = 20
net.ipv4.route.secret_interval = 600
net.ipv4.route.min_adv_mss = 256
net.ipv4.route.min_pmtu = 552
net.ipv4.route.mtu_expires = 600
net.ipv4.route.gc_elasticity = 8
net.ipv4.route.error_burst = 500
net.ipv4.route.error_cost = 100
net.ipv4.route.redirect_silence = 2048
net.ipv4.route.redirect_number = 9
net.ipv4.route.redirect_load = 2
net.ipv4.route.gc_interval = 60
net.ipv4.route.gc_timeout = 300
net.ipv4.route.gc_min_interval = 0
net.ipv4.route.max_size = 262144
net.ipv4.route.gc_thresh = 16384
net.ipv4.route.max_delay = 10
net.ipv4.route.min_delay = 2
net.ipv4.icmp_ignore_bogus_error_responses = 1
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.icmp_echo_ignore_all = 0
net.ipv4.ip_local_port_range = 32768 61000
net.ipv4.tcp_max_syn_backlog = 2048
net.ipv4.tcp_rfc1337 = 0
net.ipv4.tcp_stdurg = 0
net.ipv4.tcp_abort_on_overflow = 0
net.ipv4.tcp_tw_recycle = 0
net.ipv4.tcp_syncookies = 0
net.ipv4.tcp_fin_timeout = 1400
net.ipv4.tcp_retries2 = 15
net.ipv4.tcp_retries1 = 3
net.ipv4.tcp_keepalive_intvl = 75
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_time = 1400
net.ipv4.ipfrag_time = 30
net.ipv4.ip_dynaddr = 0
net.ipv4.ipfrag_low_thresh = 196608
net.ipv4.ipfrag_high_thresh = 262144
net.ipv4.tcp_max_tw_buckets = 360000
net.ipv4.tcp_max_orphans = 32768
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 3
net.ipv4.ip_nonlocal_bind = 0
net.ipv4.ip_no_pmtu_disc = 0
net.ipv4.ip_autoconfig = 0
net.ipv4.ip_default_ttl = 64
net.ipv4.ip_forward = 0
net.ipv4.tcp_retrans_collapse = 1
net.ipv4.tcp_sack = 0
net.ipv4.tcp_window_scaling = 0
net.ipv4.tcp_timestamps = 0
net.core.divert_version = 0.46
net.core.hot_list_length = 256
net.core.optmem_max = 40960
net.core.message_burst = 50
net.core.message_cost = 5
net.core.mod_cong = 290
net.core.lo_cong = 100
net.core.no_cong = 20
net.core.no_cong_thresh = 20
net.core.netdev_max_backlog = 1024
net.core.dev_weight = 64
net.core.rmem_default = 65535
net.core.wmem_default = 65535
net.core.rmem_max = 8388608
net.core.wmem_max = 8388608
vm.skip_mapped_pages = 0
vm.stack_defer_threshold = 2048
vm.inactive_clean_percent = 30
vm.dcache_priority = 0
vm.hugetlb_pool = 0
vm.max_map_count = 65536
vm.max-readahead = 31
vm.min-readahead = 3
vm.page-cluster = 3
vm.pagetable_cache = 25 50
vm.kswapd = 512 32 8
vm.pagecache = 1 15 30
vm.overcommit_ratio = 50
vm.overcommit_memory = 0
vm.bdflush = 50 500 0 0 500 3000 80 50 0
kernel.mem_nmi_panic = 0
kernel.unknown_nmi_panic = 0
kernel.sercons_esc = -1
kernel.overflowgid = 65534
kernel.overflowuid = 65534
kernel.random.uuid = cedb8bcb-91f6-4c1b-9d14-b6b730773f2c
kernel.random.boot_id = fc2dab6b-51da-4d81-a89b-c0edc50158ec
kernel.random.write_wakeup_threshold = 128
kernel.random.read_wakeup_threshold = 64
kernel.random.entropy_avail = 4096
kernel.random.poolsize = 512
kernel.pid_max = 32768
kernel.threads-max = 14336
kernel.cad_pid = 1
kernel.sysrq-timer = 10
kernel.sysrq-sticky = 0
kernel.sysrq-key = 84
kernel.sysrq = 0
kernel.sem = 250 32000 32 128
kernel.msgmnb = 16384
kernel.msgmni = 16
kernel.msgmax = 8192
kernel.shmmni = 4096
kernel.shmall = 2097152
kernel.shmmax = 33554432
kernel.rtsig-max = 1024
kernel.rtsig-nr = 0
kernel.acct = 4 2 30
kernel.hotplug = /sbin/hotplug
kernel.modprobe = /sbin/modprobe
kernel.printk = 6 4 1 7
kernel.ctrl-alt-del = 0
kernel.real-root-dev = 256
kernel.task_size = -1073741824
kernel.cap-bound = -257
kernel.tainted = 0
kernel.core_pattern = core
kernel.core_setuid_ok = 0
kernel.core_uses_pid = 1
kernel.use-nx = 0
kernel.exec-shield-randomize = 1
kernel.exec-shield = 1
kernel.print_fatal_signals = 0
kernel.panic_on_oops = 1
kernel.panic = 0
kernel.domainname = (none)
kernel.hostname = srv30.hostingandbeyond.com
kernel.version = #1 SMP Wed Dec 1 21:59:02 EST 2004
kernel.osrelease = 2.4.21-27.ELsmp
kernel.ostype = Linux
fs.quota.syncs = 7
fs.quota.free_dquots = 0
fs.quota.allocated_dquots = 0
fs.quota.cache_hits = 0
fs.quota.writes = 0
fs.quota.reads = 0
fs.quota.drops = 0
fs.quota.lookups = 0
fs.aio-pinned = 0
fs.aio-max-pinned = 131064
fs.aio-max-size = 131072
fs.aio-max-nr = 65536
fs.aio-nr = 0
fs.lease-break-time = 45
fs.dir-notify-enable = 1
fs.leases-enable = 1
fs.overflowgid = 65534
fs.overflowuid = 65534
fs.dentry-state = 525012 461413 45 0 0 0
fs.file-max = 209702
fs.file-nr = 1254 542 209702
fs.inode-state = 894122 383534 0 0 0 0 0
fs.inode-nr = 894122 383534




Comments:

At Thu Nov 23, 05:03:00 AM, Anonymous Anonymous said...

Hi Frank,

I'm glad to read your blog. I have a question about a setting in sysctl.conf and hope you can help.

================================
net.ipv4.tcp_rmem = 4096 1048576 8388608
================================
I understand that the above setting increases the TCP buffer size, as it would if it were set to net.ipv4.tcp_rmem = 1048576. But the setting has three values, 4096 1048576 8388608. What do they mean?

 
At Fri Nov 24, 12:39:00 PM, Blogger Frankly Speaking! said...

Thank you.

To answer your question, the values represent minimum, default and maximum bytes to use for the receive buffer of a socket.

Let me know if that answers your question or if you have any other questions.

You may also want to read this paper:
Flow Control in the Linux Network Stack

Frank

 
At Sun Dec 10, 05:47:00 PM, Anonymous Anonymous said...

Hi Frank,

Thank you for the write-up. Do you know why the following message appears?

error: 'net.core.hot_list_length' is an unknown key

This is my kernel version (CentOS 4.4)

Linux skipjack.bigfish.net 2.6.9-42.0.3.ELsmp #1 SMP Fri Oct 6 06:21:39 CDT 2006 i686 i686 i386 GNU/Linux

 
Reference: http://frankmash.blogspot.com/2005/11/sysctl-kernel-optimization.html