Sunday, May 30, 2021

High Availability Graphite Architecture


Features

Graphite service scaled out using built-in sharding and replication capabilities of the carbon component.
Two identical clusters will be replicated among separate geographical sites and both will be available to receive incoming metrics and serve queries (master-master). Within each cluster, data are sharded among each of the hosts in the cluster, such that each host has identical capabilities. This configuration provides the following features:
  • Increased capacity with the ability to scale in the future.
  • Data redundancy tolerating individual server downtime.
  • BCP - continued availability of service in the event of a complete site failure, with no action required during the outage. After recovery, whisper files should be synced to the failed site to backfill the metrics it missed.

Architecture





Each host in each cluster runs three tiers of metric routing for processing incoming metric data. At each tier, HAProxy is used to load-balance among a set of local carbon cache/relay daemons. Multiple carbon daemons are employed at each tier to make use of multiple CPU cores/threads, which is why HAProxy sits in front of each tier.
  • Tier 1, replication: HAProxy distributes port 2003 and 2004 traffic (plaintext line format and pickle format, respectively) originating from a data source (client API) to the carbon-relay daemons. Relays at this level are responsible for replicating metrics across sites (mirroring) by forwarding a copy of each metric locally to Tier 2 and remotely to a load-balanced VIP representing Tier 2 of each mirror.
  • Tier 2, sharding: HAProxy distributes port 2007 traffic consisting of replicated metrics to the carbon-relay daemons responsible for sharding metrics across all hosts of a site (cluster). Relays at this level determine the proper destination host for each metric and forward the metrics along to Tier 3 of that host accordingly (local or remote).
  • Tier 3, local relay fanout: HAProxy distributes port 2009 sharded traffic, directed there by Tier 2, to the carbon-relay daemons responsible for sharding locally among a set of carbon-cache daemons. Fanout at this level is simply for the purpose of maximizing utilization of multiple CPU cores/threads on the server. The carbon-cache daemons finally write metrics to whisper database files, each responsible for its own sharded set of metrics.
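
To make the data flow a bit more concrete, here is a minimal shell sketch, not the actual carbon code. It formats a metric in Graphite's plaintext line format (path, value, epoch timestamp, as accepted on port 2003) and picks a destination shard with a plain checksum modulo. carbon-relay itself uses consistent hashing rather than a simple modulo, so treat pick_shard only as an illustration of "same metric name, same host". The function names, the metric name and the graphite-vip host are hypothetical.

```shell
#!/bin/sh
# Format one metric in Graphite's plaintext protocol: "<path> <value> <epoch-seconds>"
# The timestamp defaults to the current time when the third argument is omitted.
graphite_line() {
    printf '%s %s %s\n' "$1" "$2" "${3:-$(date +%s)}"
}

# Toy stand-in for Tier 2 sharding: map a metric path to a shard index in [0, n).
# Assumption: plain CRC modulo; real carbon-relay uses a consistent-hash ring.
pick_shard() {
    crc=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
    echo $(( crc % $2 ))
}

# Example (hypothetical host): send a metric to the Tier 1 listener on port 2003
# graphite_line servers.web01.load 0.42 | nc graphite-vip 2003
```

Because the shard is derived only from the metric name, every relay that sees the same name sends it to the same host, which is what lets each carbon-cache own its own set of whisper files.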

Thursday, January 23, 2014

Fedora setup on ThinkPad (lenovo T530)

Intro

This document describes the steps I took to get a sane working environment on my ThinkPad laptop (lenovo T530). While the gap in usability and power for developers between Linux and Mac is closing, Linux/PC is still my workhorse of choice. First of all, almost every package you can think of is at your fingertips with a simple {yum,apt-get} install package... and if the software you're looking for is not there, download/untar/configure/make will get you anything else you need with the least amount of grief using Linux. In terms of full hardware/peripheral support of your machine, integration with the OS and general consumer usability, Mac of course is king, and this can be a substantial argument for choosing Mac (for some, the deciding factor). However, I've had at least one major incident where software behaved significantly differently on a Mac vs Linux (incorrectly on Mac). Luckily, this particular case was easy to identify, but others may not be so easy. Developing on the same OS we deploy to means fewer surprises for me when I take some finished code to the test environment. Also, I have a dual boot setup with Windows, as I still have that occasional document or app requirement come across my desk that will only run on Windows (YMMV on Mac), e.g. webex. Don't get me wrong, Mac is still a very fine choice and if I had to, I'd be productive on a Mac as well, but until we start running Macs in our production environment, I'll likely be developing on Linux.

Target Audience

These instructions are aimed at developers who prefer a Linux OS. While I chose Fedora (19), much of this will apply to any Linux distro (Mint with the Mate desktop is highly recommended), and some of it may even be helpful to Mac users.

OS Install

Disk partitioning and dual boot with Windows

If you want a dual boot setup with Windows, you'll have to make a compromise on space. I was handed a laptop with a 180G SSD drive, with only 40G available for Linux. I'm compensating for such little disk space by using the large Windows partition as a data partition within Linux, where I store big more-or-less read-only files and data stores (hadoop, etc). While 180G is not much to begin with, it is SSD so the benefits far outweigh the limited space. You have a few other choices: take your chances with GParted, re-install Windows on a smaller partition, scavenge the 15G or so recovery partition, or don't run Windows at all and have the entire drive for Linux. For my setup, I simply used the partition shrink operation built in to Windows 7 Professional: roughly, Control Panel -> Administrative Tools -> Computer Management -> Disk Management, then right-click the large NTFS partition and you should see a Shrink Volume option. For whatever reason this only freed up 42G. Windows doesn't need the rest (even after defragmentation) but seems to sit on it anyway. I could have taken my chances with GParted but will take what I have at this point.

Burn ISO and install

Most distros have an ISO that can be written to a USB drive. Grab a thumb drive from the break/supply room, then download the ISO image and write it with dd, or with a Windows utility if you don't have a UNIX machine handy.
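
As a rough sketch of the dd step, the function below copies an image to a target and then reads back the same number of bytes to compare them. The function name, the image name and /dev/sdX are placeholders. Check the real device name with lsblk first, since dd overwrites the target without asking.

```shell
#!/bin/sh
# Copy an image to a target (a USB device, or any file), then verify the copy.
write_image() {
    src=$1
    dst=$2
    dd if="$src" of="$dst" bs=4M 2>/dev/null || return 1
    sync
    # Compare only the first <image size> bytes: a device is larger than the image
    size=$(( $(wc -c < "$src") ))
    head -c "$size" "$dst" | cmp -s - "$src"
}

# Example (placeholder names; run as root to write to a device):
# write_image Fedora-Live.iso /dev/sdX && echo "image verified"
```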
Plug the USB drive into your laptop and boot; while booting, press F1 (lenovo specific) to get to the BIOS settings. Make sure your USB drive appears in the boot order before the resident hard disk. Save the BIOS settings, reboot and follow along to install your distro on your free partition(s). During this procedure I created a 2G swap partition so my final size for Fedora was 40G. Don't bother with separate partitions for /usr or /boot, /var, etc, or logical volumes; just put everything under one partition. Create a non-root username with admin (wheel group) privileges for yourself where you'll be doing your development. Pick the same username you'll have for your production system account, usually the same as your email name: first initial and last name (rsandberg). This will just make things easier for you (e.g. ssh hostname).
With your blazing SSD drive, the install should just take a few minutes. When complete, remove the USB drive, reboot and set your BIOS back to normal. At this point, if you opted for dual boot, any self-respecting distro will have detected your Windows partition and replaced your MBR with a pointer to grub, which should give you an option to boot to Windows on startup. After booting into Linux and logging in, find a console (terminal) and run:
sudo yum update
(or apt equivalent, apt-get update; apt-get dist-upgrade)

Window Managers

Fedora's default window manager is Gnome, specifically Gnome 3. This is a complete redesign of past Gnome desktops, and if you're used to Gnome 2 it will be a completely different experience. I had a difficult time getting productive with it, and apparently many others have as well (including Linus Torvalds himself), so Mate was born as a fork of Gnome 2. I ended up using Mate simply by yum installing the Mate packages; the Fedora-Mate spin (distro) is an option as well when installing the OS:
sudo yum install https://dl.dropbox.com/u/105479527/Mate-Desktop/fedora-release-extra-19/mate-desktop-fedora/noarch/mate-desktop-extra-release-19-1.fc19.noarch.rpm
sudo yum install @mate-desktop
sudo yum groupinstall 'MATE Desktop Extra'
sudo yum groupinstall 'Nonfree packages for Mate Desktop (rpmfusion needed)'
On reboot, you'll see an option under your login name to choose which window manager you want to use.

General setup

Data partition

If you're limited on space on your Linux partition (dual boot), make the Windows partition available for data storage - as root:
mkdir /win-c
add to /etc/fstab:
/dev/sda2   /win-c                  ntfs    defaults        0 0
mount /win-c
mkdir /win-c/u01
ln -s /win-c/u01 /u01
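
Before editing /etc/fstab it can help to check whether the mount point is already listed, so the entry is not added twice. A small sketch, where the helper name is hypothetical and the device and mount point are the ones used above:

```shell
#!/bin/sh
# Succeed if the given mount point appears as field 2 of a non-comment fstab line.
# Usage: fstab_has_mount MOUNTPOINT [FSTAB_FILE]
fstab_has_mount() {
    awk -v mp="$1" '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit !found }' "${2:-/etc/fstab}"
}

# Example (as root): append the entry only when it is missing
# fstab_has_mount /win-c || echo '/dev/sda2 /win-c ntfs defaults 0 0' >> /etc/fstab
```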

Developer setup

Utilities and useful packages
The following packages will be useful/necessary:
sudo yum install @development-tools
sudo yum install gitk.noarch
sudo yum install sharutils
Git-prompt
Read the beginning of /usr/share/git-core/contrib/completion/git-prompt.sh for instructions on incorporating vital git info as part of your shell prompt when you are working in a git repo.
cp /usr/share/git-core/contrib/completion/git-prompt.sh ~/etc/
Source ~/etc/git-prompt.sh in ~/.bashrc and modify PS1, etc accordingly (see PS1 example elsewhere in this doc).
bash_completion
Tab-completion in bash is extremely useful (almost necessary), however the default settings are too restrictive for me. Many times I'll use it to quickly browse a directory looking for a file before I cd there, whereas default tab completion for the cd command limits output to directory entries only (tab-tab does not produce a complete listing of the directory contents). The following patch to /usr/share/bash-completion/bash_completion is more productive for me in using common directory commands:
Copy and paste
Linux copy and paste is as simple as selecting the text you want to copy and then clicking your middle mouse button wherever you want to paste. "Where's the middle button?" you may ask, well typically it's your scroll wheel, so I find it much easier to "emulate" the middle button by clicking the left and right buttons simultaneously. To enable this emulation add the following to /etc/X11/xorg.conf.d/00-pointing.conf
/etc/X11/xorg.conf.d/00-pointing.conf
Section "InputClass"
    Identifier "middle button emulation class"
    MatchIsPointer "on"
    Option "Emulate3Buttons" "on"
EndSection
Disable Security-Enhanced Linux
For development, SELinux will cause unneeded troubleshooting and headaches. Edit /etc/sysconfig/selinux and set:
SELINUX=disabled
Ubuntu's equivalent is AppArmor.
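
The change to the config file only takes effect after a reboot. A small sketch for comparing what is configured with what is currently running; the helper name is hypothetical, and getenforce is only called if it is installed:

```shell
#!/bin/sh
# Print the mode set by the SELINUX= line of a config file.
# Usage: selinux_configured_mode [CONFIG_FILE]
selinux_configured_mode() {
    sed -n 's/^SELINUX=\([a-z]*\).*/\1/p' "${1:-/etc/sysconfig/selinux}"
}

# Show the runtime mode next to the configured one when the SELinux tools are present
if command -v getenforce >/dev/null 2>&1; then
    echo "running:    $(getenforce)"
    echo "configured: $(selinux_configured_mode)"
fi
```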
Passive sudo
Save some typing when you need to run sudo on your machine and disable the password requirement for your account: run visudo as root, comment out the first %wheel... line and uncomment the second. This assumes you granted yourself admin access or otherwise added yourself to the 'wheel' group membership. The file /etc/sudoers should look something like this:
## Allow root to run any commands anywhere
root    ALL=(ALL)       ALL
## Allows members of the 'sys' group to run networking, software,
## service management apps and more.
# %sys ALL = NETWORKING, SOFTWARE, SERVICES, STORAGE, DELEGATING, PROCESSES, LOCATE, DRIVERS
## Allows people in group wheel to run all commands
# %wheel        ALL=(ALL)       ALL
## Same thing without a password
%wheel  ALL=(ALL)       NOPASSWD: ALL
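
Since the NOPASSWD line only helps if your account really is in the wheel group, a quick check can save some confusion. A minimal sketch, with a hypothetical helper name:

```shell
#!/bin/sh
# Succeed if user $1 is a member of group $2.
in_group() {
    for g in $(id -nG "$1"); do
        [ "$g" = "$2" ] && return 0
    done
    return 1
}

# Example: confirm wheel membership, then syntax-check the sudoers file
# in_group "$(id -un)" wheel && sudo visudo -c
```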
Command history
Another bash feature I'm completely dependent on is command history searching. The pattern I use is:
  • Type Ctrl-R at the bash prompt to enter search mode
  • Enter a search string, usually just text from a command I'm trying to remember, e.g. 'find' or 'tcpdump'
  • Continue searching backward (Ctrl-R) or forward (Ctrl-S) repeatedly until I find the exact command I ran previously
  • Hit Esc to exit search mode
  • Optionally edit the command, then Enter to execute.
If I don't find what I'm looking for, or I'm just using history for reference (which I often do), I want to return the command prompt to normal. This is done with end-of-history (Alt->), but I prefer to use the Page-Down key for this. Distros have been binding Page-Up/Page-Down to history-search-backward/history-search-forward instead, so edit /etc/inputrc to recover these bindings:
/etc/inputrc
# Restored keymappings for pgup/pgdown to reach begin/end of history
"\e[5~": beginning-of-history
"\e[6~": end-of-history
#"\e[5~": history-search-backward
#"\e[6~": history-search-forward
I also set this in my ~/.inputrc on my server account.
These environment variables allow virtually unlimited command history. They should also be set in ~/.bashrc on your server account, as there will be many commands you use there for troubleshooting, etc that you'll want to recall.
export HISTFILESIZE=100000000
export HISTSIZE=100000000
vim
Some useful parts of my ~/.vimrc, you'll also want to put these in ~/.vimrc on your server account. Please add any useful settings or plugin suggestions you want to share.
~/.vimrc
set bs=2
set ignorecase
set showmatch
let loaded_matchparen = 1
set nowrap
set dir=~/etc/tmp
set smartindent
set smarttab
set expandtab
set ruler
set shiftwidth=4
set ls=2
set errorformat=\"../../%f\"\\,%*[^0-9]%l:\ %m
set ch=2
set autowrite
set iskeyword=@,48-57,_,192-255
" stop auto-commenting once a comment has started - may want to limit this to
" one-liner comments #, --, // ?
set formatoptions-=c formatoptions-=r formatoptions-=o
nmap ' :cn
autocmd BufEnter ?akefile set noexpandtab nosmarttab nosmartindent shiftwidth=4
autocmd FileType ruby set shiftwidth=2 formatoptions-=c formatoptions-=r formatoptions-=o
" Toggle between paste and nopaste with Ctrl+P
" Type Ctrl-P before pasting text from a browser, etc
" otherwise smartindent and other settings, etc will try to format it
" and it will be useless. Ctrl-P again will disable paste mode for normal
" editing.
nm <C-P> :se invpaste paste?<CR>
" unlimited command history
set history=99999 

Environment settings
The observant reader will notice that most configuration so far has been made globally - this could be considered bad practice on a shared server but since I'm the only one using my machine, it just makes things easier if I'm testing services under different accounts, etc and I want a consistent environment throughout. To achieve this with environment variables, I created /etc/profile.d/local.sh, with the contents below. Some of these settings are explained in more detail in other places in this document but here it is in its entirety. Many of these you'll also want to set in ~/.bashrc on your server account.
/etc/profile.d/local.sh
# Show matches in bold without colorization
export GREP_COLOR=01
# Move these from colorls.sh since it was removed from /etc/profile.d sourcing
alias ll='ls -l' 2>/dev/null
alias l.='ls -d .*' 2>/dev/null
# Aliases I use all the time
alias la='ls -la'
alias ltr='ls -latr'
alias ltrh='ls -latrh'
alias psa='ps auxf'
# Set unlimited command-line history
export HISTFILESIZE=100000000
export HISTSIZE=100000000
# Save typing, so I can do a simple ssh eqv-21 instead of ssh eqv-21.rfiserve.net
export LOCALDOMAIN="rfiserve.net"
# Case-insensitive matching in less and all apps that use less as a pager (man, etc)
export LESS='-i'
# So I can use environment variables in paths while I'm using tab completion (cd $HADOOP_PREFIX/<Tab><Tab>)
shopt -s direxpand
export JAVA_HOME=/etc/alternatives/java_sdk
export MAVEN_HOME=/opt/apache-maven
export PATH=$MAVEN_HOME/bin:$PATH
export HADOOP_PREFIX="/opt/hadoop"
export PATH=$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin:$PATH
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export YARN_HOME=${HADOOP_PREFIX}

Some settings which should only be applied to my specific user account go in ~/.bashrc - also may go in ~/.bashrc on your server accounts.
~/.bashrc
# Play nice
umask 002
# This deserves some explanation:
# Your default $TERM setting will have support for more than one
# buffer (screen) at a time. This is great and all but vi/vim/view, less, man
# and other commands use an alternate buffer to take a snapshot
# of your screen before you run such commands and then replaces your screen
# with that snapshot once you finish the command. Invariably, I have opened
# a manpage, or run less on a file - even vi/view a file for reference and I'll
# want to copy/paste stuff from it after exiting, not possible once the
# terminal wipes the text you were looking for and replaces it with the
# previous view of the screen. The vt100 (virtual) terminal does not support
# multiple buffers, so this is the easiest way I've found to copy/paste
# useful stuff I've found from running such commands.
export TERM=vt100
# Source for git status on your bash prompt when working in a git repo (details elsewhere in this doc)
. ~/etc/git-prompt.sh
# git pleasure, refer to the top of ~/etc/git-prompt.sh for details
export PS1='[\u@\h \W$(__git_ps1 " (%s)")]\$ '
# export GIT_PS1_SHOWDIRTYSTATE=true
# export GIT_PS1_SHOWSTASHSTATE=true
# export GIT_PS1_SHOWUNTRACKEDFILES=true
export GIT_PS1_SHOWUNTRACKEDFILES=
export GIT_PS1_SHOWUPSTREAM="auto"

ssh server
If you want to be able to use your machine to accept ssh logins (testing from another machine, etc):
sudo systemctl enable sshd.service
sudo systemctl start sshd.service
Ubuntu uses upstart (sudo stop/start/restart).

Colors, Fonts, Audio, Video and Nonfree packages

Here non-free means your freedom to distribute may be limited, rather than meaning you have to pay for something, and usually this amounts to complicated licensing interactions with GPL. In any case, Fedora takes this seriously, whereas other distros (Ubuntu, Mint, etc) not so seriously. So unless you're also committed to Fedora's philosophy, this just means a few extra steps are needed for using patented font and media rendering etc, that most of us take for granted.
General references
The biggest bang for your buck in getting nonfree stuff on Fedora is with Fedora Utils:
sudo yum localinstall --nogpgcheck http://download1.rpmfusion.org/free/fedora/rpmfusion-free-release-branched.noarch.rpm http://download1.rpmfusion.org/nonfree/fedora/rpmfusion-nonfree-release-branched.noarch.rpm
su -c "curl http://download.opensuse.org/repositories/home:/satya164:/fedorautils/Fedora_19/home:satya164:fedorautils.repo -o /etc/yum.repos.d/fedorautils.repo && yum install fedorautils"
Once installed, run the Fedora Utils GUI app. I installed just about everything available (Skype, Chrome, Oracle Java); during the install you should see a dramatic change in your fonts. The install process is logged to /var/log/fedorautils.log.

Colors and Fonts

Unless you have severe visual impairment, this section may very well be the most important.
Considering that your eyeballs will be glued to your screen 8-12+ hours a day, visual ergonomics and comfort are absolutely key to having a long, successful career as a developer. Mac users definitely have the advantage here but Linux actually has incredible versatility for fine-tuning your visual experience for endless hours of coding bliss. Necessarily, the preferences given here will be very different user-to-user but general ideas will apply. My guidelines will focus on minimal color to be easy on your eyes as they move from window to window, browser to IDE to log files to terminal, etc. I spend most of my time on the terminal (gnome-terminal or mate-terminal apps), whether troubleshooting or coding (vim), so I have carefully chosen the color scheme on my terminal: black text on a faded green background. Many IDEs come with color schemes to highlight different lexical components, statements vs functions, etc right out of the box. Most Linux-for-the-desktop distros are set up with all kinds of flash, bells and whistles to attract tourists, and while I'm all for adding new members to the Linux community, all this flash is not sustainable for us long-time residents. Consider an example of color combinations likely to give you a headache after staring at it for too long:
While this is an extreme example, you can see how easily the wrong color scheme can cause stress - there's plenty of science and discussion behind it. With that in mind, I disable color highlighting on the terminal and terminal-based editors/IDEs (vim) and use a Solarized theme for the console itself (I spend a LOT of time on mate-terminal). A valid argument can be made for using color to highlight certain areas of terminal output (log searching/tailing, etc), which allows faster reading with less work for your eyes scanning through the screen. If you prefer some kind of highlighting, try different shades of your primary text color or just use the bold option of your font. Remember, the point I'm making here is avoiding eye strain and protecting your eyes after prolonged coding sessions, late night troubleshooting, and so on over the years of your long career in technology. For IDEs I highly recommend Solarized color palettes (brilliant!). Ultimately, do what feels right for your eyes; your body will tell you what's best for you, but at least listen to it and spend some time optimizing your visual experience. Your eyes will thank you when you reach my age, still staring at a screen for hours on end. Here are my tweaks/settings:
# Fedora Utils adds some nasty color settings to /etc/bashrc, revert:
cd /etc
sudo cp -p bashrc bashrc-terminal-colors
sudo cp -p bashrc.bak bashrc
# The default ls color scheme will make your head spin, boldening/shading can be useful here
# especially for directories, so if you prefer, edit /etc/DIR_COLORS instead of disabling colorls.sh.
# As for me, I just remove all of it. This will remove aliases like ll, which I put back in local.sh.
cd /etc/profile.d
sudo mv colorls.sh colorls.sh~
# Same with grep, though boldening is very useful to save work on your eyes from scanning
# so set this in your environment (local.sh)
export GREP_COLOR=01
sudo yum install google-droid-sans-fonts.noarch google-droid-sans-mono-fonts.noarch google-droid-serif-fonts.noarch
sudo yum install freetype-freeworld.x86_64 freetype.x86_64

System font
From the System menu (if you're in Mate), select Preferences -> Appearance -> Theme tab. I use Clearlooks.


My Fonts tab looks like this:


Click Details... I have 106 dots per inch, Subpixel Smoothing, Slight Hinting and RGB Subpixel Order. Notice the dramatic difference resolution has on how the fonts you choose are rendered and actually appear on the screen. Using a comfortable resolution is paramount to reducing eye strain. If you catch yourself squinting at all, increase this and/or font sizes (optimally, a balance between both).


Terminal settings (gnome-terminal and/or mate-terminal)
Under the General tab I use DejaVu Sans Mono Book (size 11), disable terminal bell (stops annoying beeps every time you use tab completion, which is all the time) and no menubar to conserve vertical space on my screen - you can get the menu back anytime with:
$ mate-terminal --show-menubar

Under the Colors tab I have set custom text (black, #23231A) and background (green, #718E71) colors.


Under Chrome Settings, Show advanced settings... -> Web content, click Customize fonts...



Video

Try VLC or XBMC as options for playing videos:
sudo yum install xbmc.x86_64
sudo yum install vlc.x86_64

Sound

Overall, Fedora handles audio for this lenovo fairly well, no problems with headphones but I did have problems getting any sound from the laptop speakers. After a fair amount of hair pulling, I opened the laptop to expose the sound buttons on the top row of the built-in keyboard (mute, volume, etc), I then muted/un-muted a few times using the mute button and then pressed the volume button(s) until I got sound coming from the speakers. I must have hit the mute key when first booting to Windows to disable the annoying logon music. It seems that after doing that, Linux can only be un-muted by a sequence of un-mute and then volume button pushing. Thanks to these references:
From the second reference, it appears this is also a problem with Ubuntu/Mint so most likely a problem with kernel drivers.

Docking and Undocking

Docking and undocking are possible with this OS/hardware configuration without having to shutdown/reboot with the following quirk:
  • Undock - put your laptop in sleep mode and release (as normal).
  • Open the lid and use undocked as normal with laptop screen, keyboard, etc. Sleep/close, reopen repeatedly, no problems.
  • Re-dock - snap the laptop back in the docking station. Here's the problem, normally you'd just press the power button on the dock to revive. Instead, you have to open the lid while docked (use care when doing so), after a few seconds, you'll see your login screen appear on the big monitor, close the lid, wait a few seconds, then press the power button on the dock and your login screen will appear again - you're good.
Not sure of the status with Ubuntu/Mint on the T530 but likely kernel drivers as well.
Mac users, relish and beam with pride.

System Mail

Sometimes a local SMTP relay is useful for development and even sending email from the command-line. I have configured sendmail to send via my company gmail account.
Sendmail relay via gmail
# as root:
cd /etc/pki/tls/certs/
make sendmail.pem
cd /etc/mail
vi sendmail.mc
# The following lines are important:
define(`SMART_HOST', `smtp.gmail.com')dnl
FEATURE(`authinfo',`hash /etc/mail/authinfo')dnl
define(`confCACERT_PATH', `/etc/pki/tls/certs')dnl
define(`confCACERT', `/etc/pki/tls/certs/ca-bundle.crt')dnl
define(`confSERVER_CERT', `/etc/pki/tls/certs/sendmail.pem')dnl
define(`confSERVER_KEY', `/etc/pki/tls/certs/sendmail.pem')dnl
vi authinfo
# This is a new file with contents like the following (use your gmail credentials)
AuthInfo:smtp.gmail.com "U:root" "I:username@mycompany.com" "P:<your password>" "M:PLAIN"
AuthInfo:smtp.gmail.com:587 "U:root" "I:username@mycompany.com" "P:<your password>" "M:PLAIN"
chmod 600 authinfo
make
systemctl restart sendmail.service
# To test
echo test |mail -s 'test' some@email.address