Wednesday, July 20, 2011

Adding High (Private) or Low (Public) priority heartbeat link.

The other day I had a very urgent requirement to implement VCS, with guns pointed at me :). When I checked the hardware, I found I could start the cluster with only one private and one public heartbeat link, which left the cluster in a jeopardy state the moment it started.
This was quite obvious, as VCS requires at least two private links, with minimal reliance on low-priority links (not recommended by Symantec), for greater security and smooth functioning.
Meanwhile I got the hardware and wondered whether I could add it to the cluster without any disruption to the running applications. Thanks to the Symantec team for the intuitive solution!
Having faced the heat, I would like to share it with you all. Let's begin the sneak peek :)


Configuring heartbeat links is done as follows

Make sure your hardware is connected to the servers and the OS can see it, using dladm, kstat (Solaris 9 and earlier), or whichever tool is suitable.

# dladm show-dev
e1000g0         link: up        speed: 1000  Mbps       duplex: full
e1000g1         link: up        speed: 1000  Mbps       duplex: full
e1000g2         link: unknown   speed: 0     Mbps       duplex: half
#


To add a new high priority link (private heartbeat) while Low Latency Transport (LLT) is active, use the following command on each node:

# lltconfig -t <if alias name> -d <device> -b ether

Ex: # lltconfig -t e1000g2 -d /dev/e1000g2 -b ether


To add a new low priority link (utilizing a public interface), use:

# lltconfig -t <alias> -l -d <device> -b ether

Ex: # lltconfig -t e1000g0 -l -d /dev/e1000g0 -b ether
NOTE: -l marks the link as low priority.


Here we are: the new NIC now shows up in the online private interconnect configuration on both nodes.

# lltstat -l |grep link
link 0 e1000g0 on etherfp lowpri
link 1 e1000g1 on etherfp hipri
link 2 e1000g2 on etherfp hipri   <-- Newly added.
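
A quick way to confirm that the cluster now has the two private links it needs is to count the hipri lines. The sketch below is only an illustration: count_hipri_links is a made-up helper name, and it assumes the output lines are shaped like the sample in this post ("link <n> <tag> on etherfp hipri"). It runs against copied sample text, not a live node.

```shell
# Count high-priority LLT links in `lltstat -l | grep link` style text on stdin.
# Assumption: the last word of each "link" line is hipri or lowpri, as in the sample above.
count_hipri_links() {
    awk '$1 == "link" && $NF == "hipri" { n++ } END { print n + 0 }'
}

# Sample data copied from this post; on a real node you would pipe
# `lltstat -l` into the function instead.
sample='link 0 e1000g0 on etherfp lowpri
link 1 e1000g1 on etherfp hipri
link 2 e1000g2 on etherfp hipri'

hipri=$(printf '%s\n' "$sample" | count_hipri_links)
if [ "$hipri" -lt 2 ]; then
    echo "WARNING: only $hipri high-priority link(s) configured"
else
    echo "OK: $hipri high-priority links"
fi
```

With the sample above, the function counts two hipri links.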


But there is a catch here. The word “online” that I used means the configuration only remains active until the server reboots or the cluster goes offline, because whenever LLT starts it reads the “/etc/llttab” file and loads the devices listed there.

Nowhere so far have we added the new heartbeat link (e1000g2) to “/etc/llttab”. Let’s pull the trigger.

First take a copy of the LLT configuration file (/etc/llttab); it always saves you from trouble. Then open the file with any UNIX editor (mostly vi) and append the new link entry on both nodes.

After editing, the file looks as below.


# cat /etc/llttab
set-node testclus
set-cluster 100
link-lowpri e1000g0 /dev/e1000g0 - ether - -
link e1000g1 /dev/e1000g1 - ether - -
link e1000g2 /dev/e1000g2 - ether - -   <-- Newly appended.
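
If you would rather script the backup-and-append step than do it by hand in vi, something along these lines could work. This is a minimal sketch, not a VCS tool: add_llt_link is a hypothetical helper name, it only handles the simple "link <tag> /dev/<tag> - ether - -" form shown above, and the demonstration runs against a scratch copy instead of the real /etc/llttab.

```shell
# Append a high-priority link line to an llttab-style file, keeping a backup.
# add_llt_link is a hypothetical helper, not part of VCS.
add_llt_link() {
    file=$1; tag=$2
    # Do nothing if a link or link-lowpri entry for this tag already exists.
    if awk -v t="$tag" '($1 == "link" || $1 == "link-lowpri") && $2 == t { found = 1 }
                        END { exit !found }' "$file"; then
        echo "$tag already present in $file"
        return 0
    fi
    cp -p "$file" "$file.bak" || return 1
    printf 'link %s /dev/%s - ether - -\n' "$tag" "$tag" >> "$file"
}

# Demonstration on a scratch file with the pre-change contents from this post.
demo=$(mktemp)
cat > "$demo" <<'EOF'
set-node testclus
set-cluster 100
link-lowpri e1000g0 /dev/e1000g0 - ether - -
link e1000g1 /dev/e1000g1 - ether - -
EOF
add_llt_link "$demo" e1000g2
add_llt_link "$demo" e1000g2   # second call changes nothing
cat "$demo"
```

The check before appending keeps the file from collecting duplicate entries if the helper is run twice on the same node.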


Upon reboot of the system, LLT will read the additional link that was added to "/etc/llttab" in the above steps.

There you go: we reached the end of a hectic problem with a simple solution. The change to the link configuration is now permanent.



Tuesday, July 19, 2011

1-2-3 DHCP

Today I just wanted to document the quickest and easiest way of setting up your Solaris box as a DHCP client. Setting up a Sun Solaris server to receive its IP address from a DHCP server is fairly straightforward and is in fact just a matter of creating a few files on your Solaris server.

dhcpagent is the DHCP client that runs on your Solaris operating system. With this in place, create the following files:


1. Create the /etc/hostname.<interface> file

This will be an empty file.

sun1# touch /etc/hostname.e1000g0

Repeat the above for every available interface card that should get its IP from DHCP.


2. Create the /etc/dhcp.<interface>

This is also an empty file

sun1# touch /etc/dhcp.e1000g0  
                                                           

You can also use it to specify how long ifconfig should wait for a DHCP server reply before giving up and continuing with the Solaris boot. (The arrows below are annotations, not part of the file.)

sun1# cat /etc/dhcp.e1000g0
wait 60        <-- seconds to wait (default 30); can be any value or forever
primary        <-- tells ifconfig this is the primary interface, in case you have more than one


3. Specify your system name

It has to be specified in /etc/nodename. This name is going to be used as your hostname in case your DHCP-server does not return your hostname in reply to your DHCP-request.
                       
sun1# cat > /etc/nodename 
sun2                         


Reboot your system, and it should all work!

As simple as that, 1-2-3 DHCP
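
The three steps can be collected into one small function. The sketch below is only an illustration: stage_dhcp_client is a hypothetical helper name, and it writes under a scratch directory so it can be tried anywhere. On a real Solaris box the files would go directly under /etc.

```shell
# Stage the three DHCP client files under a root directory of your choice.
# stage_dhcp_client is a hypothetical helper; the file names and contents
# follow steps 1-3 of this post.
stage_dhcp_client() {
    root=$1; iface=$2; node=$3
    mkdir -p "$root/etc" || return 1
    : > "$root/etc/hostname.$iface"                        # step 1: empty file
    printf 'wait 60\nprimary\n' > "$root/etc/dhcp.$iface"  # step 2: optional settings
    printf '%s\n' "$node" > "$root/etc/nodename"           # step 3: fallback hostname
}

# Try it in a scratch directory instead of the live /etc.
scratch=$(mktemp -d)
stage_dhcp_client "$scratch" e1000g0 sun2
ls "$scratch/etc"
```

Leaving out the printf for dhcp.<interface> and creating an empty file instead gives the plain default behaviour described in step 2.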





Appendix A:

Problem: Unknown hostname

            Actually, there's one snag: most (if not all) cable modem DHCP servers don't provide you with a hostname (even if they did, odds are it won't be one you want anyway!). This wouldn't be a problem, except that the boot scripts (/etc/init.d/rootusr in particular) try to be clever, and set your hostname to "unknown" in this case, which is not at all useful!

Solution: Add the hostname to /etc/nodename as mentioned earlier.


Appendix B: Tunable dhcpagent parameters.

The file below needs to be modified to suit your requirements.

            /etc/default/dhcpagent

The most important parameter is the one below.

PARAM_REQUEST_LIST=1,3,6,12,15,28,43

This variable tells dhcpagent which options to request from the DHCP server.
Each value in the above list has a specific meaning.

    1 = subnet mask
    3 = Default Router
    6 = DNS Server
    12 = hostname
    15 = DNS Domain Name
    28 = broadcast address
    43 = Encapsulated Vendor options

If you don't want the client to request a hostname, delete the number 12 along with the comma (,) that follows it.
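
To avoid leaving a stray comma behind when editing the list, the new value can be computed first. A minimal sketch, where drop_option is a hypothetical helper name and the list is passed in as text, so /etc/default/dhcpagent itself is not touched:

```shell
# Remove one option code from a comma-separated PARAM_REQUEST_LIST value.
# drop_option is a hypothetical helper; it only rewrites the string it is given.
drop_option() {
    printf '%s\n' "$1" | tr ',' '\n' | grep -vx "$2" | paste -sd, -
}

drop_option 1,3,6,12,15,28,43 12    # prints 1,3,6,15,28,43
```

The printed value can then be pasted after PARAM_REQUEST_LIST= in the file.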

Saturday, July 16, 2011

Setting Up Solaris 10 Projects to Control Resource Usage

Overview

This is one of my favorite features of the Solaris 10 operating system. Nowadays, even relatively small machines have enough resources (CPUs, memory, etc.) to run a great number of processes in a single box. The control an administrator can enforce on resource utilization is fundamental if the machine's power is to be fully used without jeopardizing the responsiveness of certain processes. Moreover, the usage profile of a resource may be a complex function that also depends on parameters such as time. By default, Solaris 10 adapts itself dynamically so that all running applications have equal access to resources. This default behavior can be customized so that applications get resources on a preferential basis or are even denied access under certain conditions.

This post will give a quick and basic description of resource management on the Solaris 10 operating system. As usual, the official documentation can be consulted on Sun Microsystems' Documentation Center.

Projects and Tasks

Projects and tasks are the basic entities used to identify workloads in the Solaris 10 operating system. A project is associated with a set of users and a set of groups. Users and groups can run their processes in the context of any project they are members of. Both users and groups can be members of more than one project, so the relations (project, user) and (project, group) are n-to-n relationships. The project is the basic entity against which resource usage can be restricted. The task is the entity with which a process is associated, and a project, in turn, is associated with a set of tasks. Tasks are described later on.

Default Projects

Every user and every group is associated with a default project, which is the project under which their processes run unless specified otherwise. The algorithm Solaris 10 uses to determine the default project of a user or a group is the following:
  • it checks whether there's an explicit association in the /etc/user_attr database by means of a project attribute. If there is one, that's the default project.
  • it checks whether a project named user.user-id or group.group-id, in this order, exists in the projects database. If there is one, that's the default project.
  • it checks whether the special default project exists in the projects database. If it does, that's the default project.
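
The lookup order above can be sketched as a small shell function working on scratch copies of the two databases. This is only an illustration of the order of the checks, not how Solaris implements it: default_project is a hypothetical helper name, only one group is passed in for simplicity, and the sample entries and ids are made up.

```shell
# Resolve a default project following the three checks listed above.
# default_project is a hypothetical helper; it reads user_attr-style and
# project-style files whose paths are passed as arguments.
default_project() {
    user=$1; group=$2; user_attr=$3; projdb=$4
    # 1. explicit project= key in the attribute (5th) field of user_attr
    p=$(awk -F: -v u="$user" '
        $1 == u {
            n = split($5, kv, ";")
            for (i = 1; i <= n; i++)
                if (kv[i] ~ /^project=/) { sub(/^project=/, "", kv[i]); print kv[i]; exit }
        }' "$user_attr")
    if [ -n "$p" ]; then echo "$p"; return 0; fi
    # 2. user.<name>, then group.<name>; 3. the special "default" project
    for cand in "user.$user" "group.$group" default; do
        if awk -F: -v c="$cand" '$1 == c { found = 1 } END { exit !found }' "$projdb"; then
            echo "$cand"; return 0
        fi
    done
    return 1
}

# Made-up sample databases for the demonstration.
ua=$(mktemp); pj=$(mktemp)
printf 'enrico::::type=normal;project=custom-project\n' > "$ua"
printf 'default:3::::\nuser.custom-user:100::custom-user::\ncustom-project:10::enrico::\n' > "$pj"
default_project enrico staff "$ua" "$pj"
default_project custom-user staff "$ua" "$pj"
default_project guest staff "$ua" "$pj"
rm -f "$ua" "$pj"
```

On a live system, id -p run as the user in question is the simpler way to see the result.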

The Project Database

The projects database stores all the information related to existing projects in the operating system. The project database is a plain text file which can be read and modified by a set of commands such as:
  • projadd, to add projects
  • projmod, to modify projects
  • projects, to read the project database
  • projdel, to remove projects
The structure of the file is pretty simple:

project-name:project-id:comment:user-list:group-list:attributes


Field         Description
project-name  The name of the project, which can contain only alphanumeric characters and the - and _ characters. The . character can appear only in the names of default user and default group projects.
project-id    The numerical id of the project, which can be a number between 0 and UID_MAX.
comment       A string which describes the project.
user-list     The list of users who can be members of this project; its syntax is described later.
group-list    The list of groups that can be members of this project; its syntax is described later.
attributes    A set of attributes which apply to the project; its syntax is described later.

Although the file syntax is very simple, this and many other Solaris configuration files should not be edited manually. Use the corresponding commands instead.
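
Reading an entry is harmless, though, and splitting one on the colons makes the six fields easy to see. A minimal sketch: show_project_entry is a hypothetical helper name and the entry, including the project id 100, is made up for illustration.

```shell
# Split one project-database entry into its six colon-separated fields.
# Read-only: changes should still go through projadd/projmod.
show_project_entry() {
    printf '%s\n' "$1" | {
        IFS=: read -r name id comment users groups attrs
        printf 'name=%s id=%s users=%s groups=%s attrs=%s\n' \
            "$name" "$id" "$users" "$groups" "$attrs"
    }
}

# Hypothetical entry; the id 100 is invented for this example.
show_project_entry 'user.custom-user:100::custom-user::project.cpu-cap=(privileged,50,deny)'
```

Empty fields, such as the comment and the group list here, simply come out as empty strings.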

User and Group List

The list of users and groups in the projects database is a comma-separated list of values, each of which can be one of the following:
  • a user or group name
  • *, to allow all users and groups to join the project
  • !*, to allow nobody to join the project
  • !name, to prevent a specific user or group from joining the project

Attributes

The set of attributes for a project is a semicolon (;) separated list of (key, value) pairs with the syntax:

key=value
Neither the semicolon (;) nor the colon (:) can be used in the value definition.

Task

Whenever a user logs in to a project, a new task is created for that project, and it contains the login process. The task is given its own task id, and every process launched in that login session is associated with the task that owns the login process. The task is the basic workload entity and can also be viewed as a process group. Indeed, many operations supported on process groups can also be executed on tasks. The following commands create tasks:
  • login
  • su
  • newtask
  • cron
  • setproject

Determine the Current Project and Task

If you're wondering which project you're logged in to and which task your processes are running under, you can check the project with the id command:

$ id -p
uid=101(enrico) gid=10(staff) projid=10(custom-project)

I'm currently logged in as user enrico, a member of group staff, and I'm currently running in project custom-project.

The ps command can also show project information (add taskid to the -o list to see the task id as well):

$ ps -o user,uid,pid,projid
USER UID PID PROJID
enrico 101 15274 10


Other Useful Commands

The Solaris 10 operating system has a number of commands which can be used to view and manage processes by the project or the task they belong to.
  • pgrep -T | -J, to look for processes associated with the specified task, using the -T option, or the specified project, using the -J option
  • pkill -T | -J, to send a signal to the processes associated with the specified task, using the -T option, or the specified project, using the -J option
  • prstat -T | -J, to view process statistics with task statistics, using the -T option, or project statistics, using the -J option

Creating a User and a Project

Let's assume we want to cap the CPU utilization of a subset of well-known CPU-intensive processes. We do this because we're not concerned about how long these processes take to finish their job, but we do want at least 50% of the available CPUs on our workstation to stay free for other processes during normal system and user activity. The first thing we can do is create a project for these processes and then run them in a task associated with that project. In our case, since the processes are non-interactive, the easiest path is probably to create a dedicated user for them and a default user project with the characteristics we need. This way we don't even have to bother creating a specific task manually to launch these processes: they'll run in the default user project. Let's call the user custom-user and, for simplicity's sake, put it in the existing staff group:

# useradd -d /export/home/custom-user -m -k /etc/skel -P a-suitable-profile -G staff -c "custom-user" -s /bin/bash custom-user

Our user would probably need to be given a suitable profile for RBAC access control and this step will depend on your specific needs and environment. If this user needs no special RBAC configuration, we can omit the -P option. Other user related configuration activities are omitted from this example.

Once the user is created, we can create its default project using the user.user-id syntax:

# projadd -U custom-user -K "project.cpu-cap=(privileged,50,deny)" user.custom-user

Projects and Resource Caps

This command will create a project, named user.custom-user, that will be the default project for the user custom-user. This project has the following attribute:


project.cpu-cap=(privileged,50,deny)

This is one of the many resource controls that Solaris 10 puts at our disposal. For a complete list of resource controls, please read the official documentation. This resource control limits the amount of CPU available at the project level. The privileged level means that only a privileged user can change the value, and a value of 50 corresponds to 50% of one CPU. When the limit is reached, the associated action is deny, which denies the project any more CPU. The machine I'm running is a two-processor machine, so all of the processes running under the user.custom-user project together get at most 50% of one CPU, leaving the remaining 50% of that CPU and the entire other CPU free.
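
Since the cap is expressed as a percentage of one CPU, it helps to convert it into a share of the whole machine. The arithmetic is just the cap divided by the number of CPUs; cap_share below is a hypothetical helper name used only for this illustration.

```shell
# Convert a project.cpu-cap value into a percentage of the whole machine.
# cap is in percent of ONE CPU; ncpu is the number of CPUs in the box.
cap_share() {
    awk -v cap="$1" -v ncpu="$2" 'BEGIN { printf "%.1f\n", cap / ncpu }'
}

cap_share 50 2     # prints 25.0: the two-CPU workstation described above
cap_share 200 4    # prints 50.0: a cap of two full CPUs on a four-CPU box
```

So on the two-processor machine the capped project can use at most a quarter of the total CPU capacity.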

I use this project to execute long-running tasks which in reality aren't always expensive in terms of CPU, but with this configuration I can sit at my workstation, launch the processes I need in this project and let them run without worrying that the machine will be left without CPU power for the desktop and the applications I'm using when logged in.

I configured many different projects for the different classes of processes I run, and their administration is pretty easy once you're familiar with the resource controls available in Solaris 10. The mechanism of associating a default project with users and groups, moreover, is a quick way for an administrator to cap the resources used by a particular user, based on individual needs, or by a particular group. In this sense the word project really fits this kind of functionality: we have groups of people working on projects, each with different resource needs, and Solaris 10 puts at our disposal a concept that really reflects the organization of our work.

Creating a Task and Moving Processes to Projects

As described in the previous section, the default project association is a quick means for administering the server resources. Our users, indeed, aren't even conscious that they're running their processes in such a resource capped environment.

Sometimes, however, you'll need to launch processes in a different project or move processes from one project to another. For example, I haven't configured a project for myself, as I'm not assigned to one during my working day. Sometimes, though, I need to execute processes in a particular project, such as the long-running processes I described earlier, or move an existing process from one project to another. This is the case when, for example, a process is unexpectedly consuming more resources than I'd like and I prefer to move it to a resource-capped project to improve the machine's responsiveness.

If you need to create a task in a specific project, and your user is allowed to join that project, you can do so with the following command:

$ newtask -v -p my-project
25

This command creates a new task in the my-project project, and the -v flag prints the id of the newly created task, which can be useful with other commands that accept a task id. Launching the newtask command in your shell also has the side effect of putting your shell into the newly created task, so you can immediately launch new processes inside this task.

Let's now suppose that you've detected a process which is consuming too much of a resource and you'd prefer to move it inside a properly capped project. The first thing you need is the process id, which can be obtained in a number of ways:

$ pgrep process-to-move
15257

Now that we have the process id of the process to be moved, we can create a new task in the desired project and move the process inside it. This can easily be done with a one-liner:

$ newtask -v -p my-project -c 15257
27

It's not necessary, but if you want to be sure that the process is now executing in the context of the desired project, you can use, for example:

$ pgrep -T 27
15257

which confirms that the process number 15257 is executing inside the task 27.

Associating a SMF Managed Service with a Project

So far, we've seen the basic tools to define and manage Solaris 10 projects. The administrator is now able to:

  • Define projects.
  • Define resource caps for a project.
  • Move processes to tasks and assign tasks to projects.
  • Assign a project to a user and/or a group of users.

Conclusion

This is a really short introductory walk-through of Solaris 10's capabilities for configuring and monitoring resource utilization. The official documentation describes the list of resource controls that Solaris 10 puts at the administrator's disposal and the complete set of commands that can be used to monitor and even change these parameters at runtime, when they can be real life-savers. Simply knowing that they exist may one day make your life easier.

Thursday, July 14, 2011

Set Title of XTERM window

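The title of an xterm window can be set using the following escape sequence, where ESC is the escape character (shown as ^[ in the example) and ^G is the bell character (Ctrl-G):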

ESC ] 0 ; title ^G

Example:

echo "^[]0;This is a title^G"
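
Typing the literal control characters can be awkward, so an alternative is to let printf generate them from octal escapes. A small sketch, where xterm_title is just a hypothetical function name:

```shell
# Emit the xterm title escape sequence without literal control characters.
# \033 is ESC and \007 is BEL (^G).
xterm_title() {
    printf '\033]0;%s\007' "$1"
}

xterm_title "This is a title"
```

Dropping the function into a shell startup file makes it available in every new terminal.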


Wednesday, July 13, 2011

LINK AGGREGATION

Aggregation is the process of combining two or more NIC ports to have increased bandwidth and redundancy.

Pre-Requisites:

1. Ports need to be aggregated at the hardware end (switch) before performing link aggregation at the OS end.

2. You need GLDv3-enabled NIC cards, and your OS should have local-mac-address?=true.
NOTE: NICs that are not GLDv3-enabled can use IPMP instead, at the cost of extra test IPs.

3. Log in via the console, since the interface carrying the current IP is unplumbed during the procedure.




# ifconfig -a
lo0: flags=2001000849 mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
nxge0: flags=1000843 mtu 1500 index 6
inet 192.168.106.247 netmask ffffff00 broadcast 192.168.106.255
ether 0:21:28:34:ad:22
#


# eeprom | grep mac
local-mac-address?=true
#


# dladm show-dev
nxge0 link: up speed: 1000 Mbps duplex: full
nxge1 link: up speed: 1000 Mbps duplex: full
nxge2 link: up speed: 1000 Mbps duplex: full
nxge3 link: up speed: 1000 Mbps duplex: full
#


# dladm show-aggr
No output is displayed, as no aggregations exist yet.


# ifconfig nxge0 down unplumb


# dladm create-aggr -d nxge0 -d nxge1 -d nxge2 -d nxge3 1
The trailing one (1) is the key from which the aggr device name is formed [ range: 1-999, not zero (0) ]
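
With many ports, it can help to assemble the command line first and review it before running it. The sketch below only prints the command; aggr_cmd is a hypothetical helper name, and the 1-999 key check simply mirrors the range quoted in this post.

```shell
# Print (not run) a dladm create-aggr command for a key and a list of devices.
# aggr_cmd is a hypothetical helper; it rejects keys outside 1-999.
aggr_cmd() {
    key=$1; shift
    case $key in
        ''|*[!0-9]*) echo "key must be a number" >&2; return 1 ;;
    esac
    if [ "$key" -lt 1 ] || [ "$key" -gt 999 ]; then
        echo "key must be in the range 1-999" >&2; return 1
    fi
    cmd="dladm create-aggr"
    for dev in "$@"; do
        cmd="$cmd -d $dev"
    done
    echo "$cmd $key"
}

aggr_cmd 1 nxge0 nxge1 nxge2 nxge3
```

For the four ports used here it prints the same create-aggr line as shown above.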


# ls -l /dev/aggr1
lrwxrwxrwx 1 root root 30 May 31 02:59 /dev/aggr1 -> ../devices/pseudo/aggr@0:aggr1
#
This device is formed after create-aggr


# dladm show-aggr
key: 1 (0x0001) policy: L4 address: 0:21:28:34:ad:22 (auto)
device address speed duplex link state
nxge0 0:21:28:34:ad:22 1000 Mbps full up attached
nxge1 0:21:28:34:ad:23 1000 Mbps full up attached
nxge2 0:21:28:34:ad:24 1000 Mbps full up attached
nxge3 0:21:28:34:ad:25 1000 Mbps full up attached
#


# ifconfig aggr1 plumb
May 31 03:25:47 test1 last message repeated 1 time
May 31 03:25:47 test1 nxge: NOTICE: nxge2: xcvr addr:0x1b - link is up 1000 Mbps full duplex
May 31 03:25:48 test1 nxge: NOTICE: nxge3: xcvr addr:0x1a - link is up 1000 Mbps full duplex
May 31 03:25:48 test1 nxge: NOTICE: nxge1: xcvr addr:0x1c - link is up 1000 Mbps full duplex
#


# ifconfig aggr1 192.168.106.247 netmask 255.255.255.0 up


# ping 192.168.106.1
192.168.106.1 is alive


# cat > /etc/hostname.aggr1
192.168.106.247
#