VCDX: Some thoughts on requirements

What is this about?

This blog post does not intend to define what a requirement or constraint is; there are already very good posts out there. Lately I have pointed a lot of people to Jeffrey Kusters, who did an excellent job of summarising it, but I also included two other good links:

If you are like me, coming from a purely technical background, the conceptual model may prove to be the hardest part of the VCDX exam – especially since your journey starts with it and there is no shortcut around it (Rene gives good advice, as always: https://vcdx133.com/2014/07/09/vcdx-how-do-i-measure-if-my-customer-requirements-are-being-met/). Think of the conceptual model as the foundation of a house: if it is not solid, everything you build on top of it will eventually collapse. (By the way, this happened to me and forced me to rewrite more than once. Pro tip: don’t be like me on this.)

Summing it up: Ideally this post makes you realise that you need to invest time into learning how to develop a conceptual model and, since this post focuses on requirements, that requirements do not come out of thin air. There is actually a whole field called “requirements engineering” dedicated to gathering and formulating requirements – just to give you a feeling for the relevance of this topic.

Requirement engineers have methods and techniques which you can study. Use this to build an understanding of what is relevant and how it is done. Try to apply this by formulating some solid requirements for your VCDX process (and keep that knowledge for your future projects).

Where do I start?

You know, there is always Google 🙂

Seriously, (solid) requirements are needed in a lot of places; one of my favorite reads is provided by NASA.
They have a whole book online, for free – start with chapter 4, the System Design process: https://www.nasa.gov/connect/ebooks/nasa-systems-engineering-handbook

Do you need to read all of this and how does this all apply to VCDX?

Heck, no, you don’t need all of this! But start digging into it and you will learn some good stuff – the key here is building an understanding of why requirements are important. For instance, I like “TABLE 4.2-1 Benefits of Well-Written Requirements”.

Also, have you considered talking to people who deal with requirements on a daily basis? Do you know any project managers or software developers/architects? They might be more than happy to help you out.

Can you sum it up – what does this mean for my VCDX document?

I cannot give you a definitive answer but a few personal opinions:

  • Write a requirement like the stakeholder investing the money would, not like the tech nerd you are (I include myself here).
  • Don’t focus on the implementation and do not make a hidden design decision out of a requirement: Focus on what the system/infrastructure needs to achieve, not how.
  • Have you tested whether other people understand your requirement? Ask around, including non-technical people. Does everybody expect the same thing when reading it?
  • For the majority of requirements, avoid subjective adjectives – e.g. what do you mean by “fast storage”? People might have different opinions on that.
  • Going in the same direction as the bullet point above, can you validate your requirement in any way? (Yes, this is one reason why there is a validation plan in the VCDX)
  • Be specific, set scope and expectations: when you state growth in percent, is it measured from your baseline or year over year? For how many years do you need to plan? Which areas (compute, storage, …) do you need to consider?
  • Avoid misinterpretation through negative requirements, e.g. “must not do X or Y”. The “not” is easily overlooked, and it still leaves open the question of what the design must do.
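The growth bullet deserves a worked example. Here is a quick sketch (all numbers invented) of why “10% growth over 3 years” is ambiguous:

```shell
# Hypothetical example: 1000 VMs today, "10% growth" planned for 3 years.
baseline=1000

# Reading 1: 10% of the baseline, added each year
flat=$(( baseline + baseline * 10 / 100 * 3 ))

# Reading 2: 10% year over year (compounding)
yoy=$baseline
for year in 1 2 3; do
  yoy=$(( yoy * 110 / 100 ))
done

echo "from baseline: $flat VMs, year over year: $yoy VMs"
```

With identical wording the two readings already differ by 31 VMs here, and the gap widens at larger scale – so the requirement must name its baseline.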

On the topic of how much metadata a requirement needs, I used a table with the following information:

  • Unique ID: Allows you to reference the requirement in your design
  • Description: The main matter of the requirement.
  • Design quality: More for my own sake, to ensure I had everything covered
  • Issuer: Who signed off on the money going into this requirement?
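Purely as an illustration – the requirement text, design quality and issuer below are invented – a row in such a table could look like this:

```shell
# One hypothetical requirement rendered with the four fields above
# (ID, description, design quality, issuer -- all values invented).
row=$(printf '%-6s | %-50s | %-14s | %s' \
  "R001" "Tolerate the failure of one host without downtime" "Availability" "CIO")
echo "$row"
```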

I won’t say it is perfect but it did the job and it may be a good starting point if you haven’t considered anything in this regard.

The end

This is not much, but I hope it points candidates in the right direction. I am always open to discussion and feedback – hit me up on Twitter if you like!

Disclaimer: Honestly, I feel like an imposter for writing this, constantly debating with myself whether I dare put this out into the wild, as I feel my own material was not stellar. However, with some support from Bilal and Chris I decided to go for it. After all, it is a topic most candidates struggle with, and I was no exception.

VCDX basic skills: the whiteboard

Disclaimer: I am by no means an expert on this topic, nor do I have the feeling that my performance during the VCDX defense was exceptional. However, some basic training got me from “zero” to at least one step further – more or less readable handwriting.

We all take it for granted – there is a whiteboard in the room and we all know how it works, right? In any kind of meeting, when there is an idea in the room and you need to take notes or visualize something, you start drawing and writing on the whiteboard. Then one of the next sentences is often “sorry for my handwriting” or “I hope you can read this”.

I was in the same position and it felt somehow “unprofessional”. In the weeks before my VCDX defense, with the design scenario (which relies heavily on the whiteboard) looming over my head, I wondered whether I could learn or improve my skill set in this regard.

Guess what – thank you, internet! There is nothing you cannot find on the internet; you just need to know what to look for.

The tool: Whiteboard marker

Another thing we tend to ignore is the whiteboard marker (except when it is dried out – then we start swearing). But did you ever wonder why there are different tips? In the picture you see the chisel tip (top) and the bullet/round tip (bottom).

Different tips serve a purpose

Chisel tip: Often between 2 and 5 mm. The best use case for this marker is writing. Vertical lines are slender and horizontal lines are bold, which makes your writing easily readable, especially from further away.

Bullet/round tip: Around 2 mm. Use this for drawing. Your line width will not differ by direction and you can add more detail.

Chisel on top, bullet-tip bottom.
  • VCDX tip #1: Bring your own markers into the room. You do not want to deal with a dried-out pen.
  • VCDX tip #2: You won’t be able to switch between different marker tips in the room – time is of the essence and you don’t want to start fumbling around. Practice with and stick to one type of marker; which one works best, you need to find out for yourself.

If you want to go to expert level on different brands, see here. Personally, I didn’t want to spend a fortune, and the box that comes with the Staedtler pens is quite handy.

The overlooked: Handwriting

If you can spare a few minutes, read this blog post by Yuri Malishenko for a great overview on how to improve. For me this was really the starting point for all further efforts.

If you want a TL;DR at this point: learn to write in CAPITAL BLOCK LETTERS – no one can read the cursive you learned in school. No whiteboard at home, you say?

Flashcards can be used to improve your writing and to help you remember information

I got back into the habit of writing flashcards. With the right pens (in my case 1–2.5 mm chisel-tip pens) you can practice your capital block letter writing and memorize important information at the same time.

Another aspect is the size of your handwriting, again referring to an excellent article by Yuri Malishenko – use rule #5 to determine the size.

TL;DR: As a general rule of thumb, write letters about 3 cm high. That is as high as the index and middle finger put together, e.g. if you write with your right hand, use the fingers of your left hand to gauge the size of your letters.

Often forgotten: T-T-T

A golden rule of whiteboarding (or flipchart work) is:

  • touch
  • turn
  • talk

Please, do not talk to the whiteboard. Now repeat: Do not talk to the whiteboard!

Touch: When you are finished writing or drawing, touch the point on the whiteboard you want to highlight/talk about

Turn: Turn to the audience and make sure you got their attention

Talk: Now you can start explaining.

Summary

All of this takes some time to learn and personally, I keep improving all the time. Perhaps this blog post might help you prepare for the VCDX by giving you some confidence in an area you had not considered so far.

VCDX: Thank YOU

On December 13th, 2018 I received an email: 

For me this marks the end of a chapter which I would call the VCDX journey and I have to thank many, many people for supporting me along the way up to this point.

Easily the longest supporter of my efforts has been Bilal Ahmed. In his always good-natured way he has guided me since the VCAP design days with solid advice and motivation. In the last weeks before the defense alone, he went out of his way to connect me with mock panelists so I could refine my presentation over and over before going in.

Yet another important person is David Pasek, whom I contacted right after joining VMware to ask if he would mentor me. I guess without him I would still be editing and redoing my document. David is not only a sheer endless source of knowledge but also has the great gift of cutting through the noise and focusing on the important parts, always able to get me back on track.

Also, thanks a lot to all the other VCDX mentors who helped me along the way, like Paul Meehan, who is not only a great guy but has tons of knowledge to share and always motivated me to keep pushing. Paul McSharry, who, before becoming a panelist, did a ton for the VCDX community. Per Thorn always had high-quality and in-depth answers. Gregg Robertson, for doing all the work with in-person mocks in the UK as well as the Slack channel, both of which are vital for future VCDX candidates. Update: Damn, I forgot to mention Ben Mayer – in the time after submitting the docs he helped me with multiple scenario sessions and gave valuable advice for the presentation.

A special thanks to Manny Sidhu for getting up at 5 a.m. (!) to attend one of my mocks, and to many more who donated their spare time (like Shady). The “closing call” – the last mock defense I had – actually took place only about 12 hours before going into the room, featuring a panel of Kiran Reid, Jason Grierson, Bilal and the future VCDX #273, Kenneth Fingerlos (to be honest, this session left me a bit shaken, but it was great and held some valuable lessons).

Also, here is a shout out to my favorite Slack group with guys like Bilal, Kyle Jenner, Chris Porter and Mat Jovanovic. It is always a great mixture of banter and solid knowledge exchange with you guys. Chris also organised a mock session during VMworld Barcelona which was a direly needed wake-up call for me to get on with my presentation (thanks to everyone who attended that session in BCN).

During all this time I was fortunate enough to have support from my employers (current and past). At VMware from Matthias Diekert, who, without batting an eyelid, offered me full support by taking over travel and expenses. At my former workplace the CEO and my team lead supported my efforts, too.

Last but not least … the family. Man, they say you can’t do VCDX without the family, and they are right. With a second kid in late 2017 and a job change in mid 2018, VCDX was no fun in the spare time, often leaving me only the hours between 10 p.m. and 1/2 a.m. for my work. My partner supported me all the time, either by “kicking my ass” to get up and start writing/studying again or by taking the kids out for a weekend in the days before the deadline – just so I could work all day (and night) to finish it.

What’s next?

VCDX was a time and resource-intensive process, at least for me. Getting back to a more normal work/life balance with the notion of picking up some sports again is one of my goals for 2019.

But being in IT, you cannot stay in one spot, and from a professional perspective I fell behind on my training schedule (yes, I keep one for myself as part of the goals I want to reach – if you don’t do this, perhaps Melissa might change your mind). The next priorities are to catch up on public cloud, some sort of automation and, very specifically, NSX-T.

Perhaps I will make it to a VMUG and find a topic to present. I have always wanted to do it, but so far I do not know what to talk about. Also, some more blog posts wouldn’t hurt, so there is another thing to do.

Recovering the VCSA on a vSAN cluster

Disclaimer: The credit for the answer goes to John Nicholson (http://thenicholson.com/) a.k.a. lost_signal from the VMware SABU; I merely added some points.

While going through my physical design decisions, I came across a simple question for which I couldn’t find an immediate answer:

How can I restore my vCenter instance (VCSA) if I put it on the very same cluster it is supposed to manage? Can I restore it directly on vSAN via an ESXi host?

As my google-Fu let me down, it was time to start a discussion on reddit:

vSAN question: Restore VCSA on vSAN from vmware

 

TL;DR: The good news is: yes, you can recover it directly, and with vSAN 6.6 clusters this is straightforward with no prerequisites. Look into the vSAN Multicast Removal guide for the post-processing steps.

As there are other aspects you generally need to consider (not only for vSAN), I decided to summarize some basic points (for clusters running 6.6 and onward):

  • First things first: make a backup of your VCSA on a regular schedule, in line with your recovery objectives.
    • If you are on vSAN, you should check for SPBM support in your selected product: the good if you have support, the bad if you don’t
  • Create ephemeral port groups as recovery options for the VCSA and the vSAN port groups
    • This is not vSAN-specific but should generally be considered when the vCenter lives on the same vDS it manages
  • Make a backup of your vDS on a regular basis (or at least after changes)
  • Export your storage policies
    • Either as a fallback in case you make accidental changes, or for reference/auditing purposes
    • You might need them if you are ever forced to rebuild the vCenter from scratch
  • John pointed out that a backup product with “boot from backup” capability (e.g. Veeam Instant Recovery) sidesteps the initial question entirely, as an additional (NFS) datastore is mounted.
    • A point from myself: verify the impact of NIOC settings if you followed the recommended shares for the vDS in the vSAN guide. The NFS mount uses the management network VMkernel interface, which is quite restricted (note that this only matters if you have bandwidth congestion anyway).

I would be more than happy if anyone is willing to contribute to this.

When you are using SPBM but the rest of the world is not (vSAN)

Today I came across an issue I did not immediately think about when selecting a data protection or replication solution for a vSAN deployment:

Let us say we have a vSAN datastore as target for a replication (failover target) or a data restore from backup. But what if your data protection or disaster recovery/replication product does not support storage policies?

You might find yourself facing some unexpected problems.

The restore or failover might succeed, but your VM files (including VMDKs) are subsequently protected with the vSAN default policy. If you did not modify it, this results in FTT=1 and FTM=RAID1 (if you are not familiar with FTT and FTM, search for them in conjunction with vSAN).

At first glance, this does not look too bad, does it?

But…
Now what if the source VM was protected with FTT=2 and FTM=RAID6?
The restored VM now has less protection with more space consumption, and it might not even fit on the datastore – even if the clusters are set up identically, or even if it is the same cluster (in the case of a restore).

Example:

A VM with a 100 GB disk consumes 150 GB on the source vSAN datastore (with FTT=2 and FTM=RAID6) and can withstand two host failures. However, it would consume 200 GB on the destination datastore (with FTT=1 and FTM=RAID1), as the latter creates two full copies, and only one host failure can be mitigated.
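The arithmetic behind this example can be sketched as follows (assuming the usual vSAN capacity factors: RAID1 with FTT=1 stores two full copies, RAID6 with FTT=2 adds 50% overhead):

```shell
vm_gb=100

# FTT=2, FTM=RAID6: erasure coding adds 50% overhead -> 1.5x capacity
raid6_gb=$(( vm_gb * 3 / 2 ))

# FTT=1, FTM=RAID1: two full mirror copies -> 2x capacity
raid1_gb=$(( vm_gb * 2 ))

echo "source (RAID6): ${raid6_gb}GB, destination (RAID1): ${raid1_gb}GB"
```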

Sure, you could modify the default policy for this, but what if you have different settings per VM? The beauty of SPBM lies in the fact that you can apply it per disk, and re-applying the policy settings in a more complex setup will become messy and error-prone.

Now, if you ask me for good examples of how to do it:

Veeam shows how to integrate this here.

VMware offers a storage policy mapping in SRM

VCSA 6.5 U1: vAPI status “yellow” and content library not started (possible fix)

As this error has now affected me at multiple customer installations, it is time for a blog post 🙂

After upgrading a site to 6.5 U1 I noticed several issues:

  • The vAPI Endpoint status changed to “yellow”
  • The Content Library service would not start

The only resolution I found within the VMware KB was “restart the services”.

As that didn’t help, I searched on and found VMware KB 2151085 with the cause of the error:

 

The ts-config.properties file is deployed with the noreplace option. With this option, the ts-config.properties will no longer be overwritten; instead it is saved with the extension .rpmnew.

and a nice hint:

This is a known issue seen with several upgrade paths to vSphere 6.5 Update 1. Not all upgrade paths are affected; VMware is investigating affected paths and this article will be updated once confirmed.
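Based on that cause, the manual fix boils down to promoting the .rpmnew file. Here is a generic sketch – the actual location of ts-config.properties on the VCSA is an assumption on my part, so verify the paths against KB 2151085 before touching a real appliance:

```shell
# Promote a .rpmnew file left behind by an upgrade, keeping a backup of
# the old file. Directory and file names must be verified against the KB.
apply_rpmnew() {
  target="$1"                      # e.g. .../ts-config.properties (path assumed)
  if [ -f "$target.rpmnew" ]; then
    cp -p "$target" "$target.bak"  # keep the old config around
    mv "$target.rpmnew" "$target"
    echo "applied $target.rpmnew"
  fi
}
```

Afterwards, restart the vAPI endpoint service as described in the KB.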
I hope this helps someone else as the KB entry isn’t obvious.

Fun with vSAN, Fujitsu and (LSI) Broadcom SAS3008

Update 2015-08-07:

Found a newer version of the file with the name P15, containing version 16.00.00 (right…) of the sas3flash utility here

This one works fine:

 

——-

Today was “one of those days”:

A simple vSAN installation turned into a nightmare of reboots, downloads, google searches and so on.

The problem at hand was a cluster of Fujitsu RX2540M2 vSAN nodes with CP 400i controllers (based on LSI/Broadcom SAS3008 chipset) and vSAN 6.5U1.

For vSAN 6.5 U1 the HCL requires the controller  firmware version 13.00.00, but Fujitsu delivered it with version 11.00.00.

Easy enough, the plan looked like this:

  1. Download the sas3flash tool for VMware from Broadcom here
  2. Copy the VIB to the host and install it via esxcli
  3. Download and extract the firmware (with 7-Zip you can open the .ima file) from Fujitsu here
  4. Flash the controller online with /opt/lsi/bin/sas3flash

Unfortunately, the controller did not show up in sas3flash and I was not able to find it with the tool, no matter what I did.

Then the fun began, I tried:

  • Online patching via lifecycle management: Did not find any updates
  • The Fujitsu ServerView Update DVD: Wouldn’t find any CP400i or any required update
  • The USB stick from Fujitsu with the firmware (see link above): Wasn’t able to find the controller when booting into DOS mode
  • Getting into the UEFI shell: this isn’t easy with Fujitsu anymore – read here

In the end, I gave up:

I pulled the controllers from the server, fitted them into a UEFI-capable workstation and flashed them with the Fujitsu USB stick from the UEFI shell. Worked like a charm and took me like 5 minutes. If your USB stick is mounted as fs0, the command looks like this:

sas3flash -f fs0:lx4hbad0.fw -b b83100.rom -b e150000.rom

or you could simply use the force.nsh script 🙂

After refitting the controllers in the server and rebooting, sas3flash on ESXi still reports no controllers.

I am at a total loss as to why this happens. A few weeks ago this worked totally fine with a similar setup at another customer’s site.

I am curious to know why this didn’t work and how to fix it.

Any updates will be posted here.

Some notes about Oracle Backup with Veeam

Veeam introduced the Oracle backup feature with version 9, and a lot of people are very excited about this, because Oracle backups were one of the last bastions of the “old guard” of enterprise backup products.

I was facing a bit of a problem here, since Oracle DBs are definitely not my turf, but now it is expected that I can handle them in a backup/recovery case.

So, the goal of this post is to establish some basic concepts of Oracle’s DBMS and share some notes on how Veeam Backup behaves when backing up Oracle databases (perhaps this might interest folks who can compare this to RMAN). Feel free to correct me; as stated above, this is not my usual business.

 

Requirements

Taken directly from the Veeam documentation center, these are the requirements for enabling log shipping.

  • Veeam Backup & Replication supports archived logs backup and restore for Oracle database version 11 and later. The Oracle database may run on a Microsoft Windows VM or Linux VM.
  • Automatic Storage Management (ASM) is supported for Oracle 11 and later.
  • Oracle Express Databases are supported if running on Microsoft Windows machines only.
  • The database must run in the ARCHIVELOG mode.

So much for the version requirements – OK, but what is an archive log?

Oracle Logging

Redo Logs

Coming from the MSSQL world, I am familiar with transaction logs, but in Oracle these are called “redo logs”. Unlike the t-log, you need to have at least two redo logs (quotes are from the Oracle DB manual):

The redo log of a database consists of two or more redo log files. The database requires a minimum of two files to guarantee that one is always available for writing while the other is being archived.

Of these (at least two, more likely three), only one is written to at a time:

LGWR (log writer) writes to redo log files in a circular fashion. When the current redo log file fills, LGWR begins writing to the next available redo log file. When the last available redo log file is filled, LGWR returns to the first redo log file and writes to it, starting the cycle again.

Notice that all logs will be overwritten at some point, i.e. when the log files fill up. This is more like the SIMPLE recovery model in MSSQL.
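As a toy illustration of that circular behavior (this is not Oracle code, just a model of LGWR cycling over three redo log files):

```shell
# Three redo log files; LGWR fills them in a circle and wraps around.
logs="redo01 redo02 redo03"
count=3
for seq in 0 1 2 3 4; do
  idx=$(( seq % count + 1 ))
  current=$(echo $logs | cut -d' ' -f$idx)
  echo "log switch $seq -> writing to $current"
done
```

After filling redo03, the cycle returns to redo01 – without ARCHIVELOG mode, its previous content is gone at that point.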

And finally some nomenclature to distinguish the logs being written to from those currently not in use:

Oracle Database uses only one redo log file at a time to store redo records written from the redo log buffer. The redo log file that LGWR is actively writing to is called the current redo log file.

Redo log files that are required for instance recovery are called active redo log files. Redo log files that are no longer required for instance recovery are called inactive redo log files.

Archived Redo Logs

I stated that we will lose the content of the logs at some point; the solution for keeping them is ARCHIVELOG mode.

An archive log is an “archived redo log”, and it leads us to the equivalent of the FULL recovery model in MSSQL.

I am borrowing the following explanation from Oracle Terminology for the SQL Server DBA

These are redo log files that have been backed up. There are a number of ways to have Oracle automatically manage creating backups of redo log files that vary from manual to completely automated.

If the disks storing these files fill up, Oracle will not be able to write to the data files – active redo log files can’t be archived any more. To ensure safety, writes are stopped.

The last paragraph is an important one; we will come back to it later on.


Veeam with Oracle backup

 

At this point I trust that you know how to set up a Veeam job; perhaps I will find time to add this more basic stuff later on.

Most people will use a 24-hour image backup schedule for their VMs. To reduce the RPO you would enable “log shipping” within the application-aware backup tab.

Note 1: Veeam does not use RMAN (Recovery Manager) but rather the Oracle Call Interface (OCI).

Note 2: Veeam will only delete log files with the next image backup (full or incremental), not with the log shipping.

Note 3: Veeam doesn’t delete empty log folders after clearing *.arc files

Note 4: Veeam will issue a (global) log switch as part of the backup. This archives all online redo log files, meaning that you need space on your archive log partition.

So, with note 4 in mind, imagine a situation where your archive partition is quite full and you want a backup to clear it up. The backup might actually need more space before anything can be cleared. Also, you cannot use Veeam to clear an already full partition. Right now, there doesn’t seem to be an option to change this.

Note 5: The behavior of “Delete logs older than <N> hours” or “Delete logs over <N> GB” may not be so obvious.

First, this only takes effect when you run a full/incremental backup (see note 2). If you keep the default “delete after 24 hours” and make an image backup once a day, you will always have nearly two days’ worth of logs on disk: the last day’s and the current logs.
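In numbers, assuming the defaults above: logs are only pruned when an image backup runs, so just before a run the oldest surviving log can be one retention window plus one backup interval old:

```shell
backup_interval_h=24   # one image backup per day
retention_h=24         # "delete logs older than 24 hours"

# Pruning only happens at backup time, so right before a run the
# oldest surviving archived log can be this old:
worst_case_h=$(( retention_h + backup_interval_h ))
echo "up to ${worst_case_h}h of archived logs can sit on disk"
```

Size your archive log partition for this worst case, not for the configured retention alone.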

Deploying the IBM StorWize / SVC IP quorum

*Update: Thanks to Daniel Huber for making some corrections. He found an error in the systemd script as well as in my Java/JDK instructions.

A while ago, in late 2015, IBM announced “IP quorum” support with version 7.6.

I didn’t have the need or opportunity to play with this, but after a colleague implemented an installation with an IP quorum I wanted to try it for myself. The storage system part is fairly easy, but they leave it up to you how to implement the Linux part.

A bit of technical background

A quorum device is needed in IBM SVC “stretched” or StorWize “HyperSwap” installations and acts as a “tie breaker” placed at an independent location outside your two main data centers.

The traditional quorum device for StorWize/SVC is an “extended quorum”-qualified Fibre Channel array. Usually either the FC connectivity or the storage array (cost, rack space, heat dissipation) at the third site becomes a problem. So, here comes the IP quorum to save the day.

The IP quorum consists of a simple Java application that needs to run on a Linux server. However, IBM has narrowed the choices for your deployment considerably:

I don’t want to go into possible reasons why they do this, but if you are in a production environment make sure you fulfill the requirements.

Note that there is a gotcha with using the IP quorum (taken from the IBM documentation center):

Unlike quorum disks, all IP quorum applications must be reconfigured and redeployed to hosts when certain aspects of the system configuration change.

Now this is serious stuff; the “good old” FC quorum is a fire-and-forget solution which you don’t need to touch after setup.

Reading further on, you’ll find some more information about the required network quality; in a typical metro/campus installation this should be of no concern, but make sure you check it:

  • The maximum round-trip delay must not exceed 80 milliseconds (ms), which means 40 ms each direction.

  • A minimum bandwidth of 2 megabytes per second is guaranteed for node-to-quorum traffic.

If you are worried about the availability:

The maximum number of IP quorum applications that can be deployed is five.

Be grateful for this. You want to patch your Linux on a regular basis, don’t you? And a 1U rack server or workstation does not offer the same level of redundancy a storage system does. On the downside, if you have more than one installation you’ll have to redeploy the quorum to multiple locations after a major configuration change (see above).

Personal comment time:

Using an IP quorum looks like a nice thing: there is no barely used storage array sitting in a third location, lower costs and so on. However, make sure you know the implications for your storage environment. This is a kind of paradigm shift, since you give a key part of your availability solution out of your hands and rely on network and servers. Organisation-wise this might include working with different teams, so change management becomes crucial.

Make sure you know how the network behaves in a site disaster so that the quorum stays available – the only time your IP quorum is really needed, it must be reachable. Test this on a regular basis. Make sure the server team doesn’t patch all your quorum servers at once, and test them after updating.

Time for some deployment:

For my tests I am using an IBM StorWize V7000 generation 1.

Since all models (including SVC) share the same code base, this how-to should be valid for all platforms.

As you can see I am using the latest and greatest code level, version 7.8.

Creating the quorum application

The latest version gives me some advantages, including the fact that you can deploy and monitor the IP quorum from the UI:

Since my installation uses IPv4 only (shame on me), the option for the IPv6 download is greyed out. And yes, I am using the “superuser” account for this demonstration (I guess that’s the second mark against me).

As you can see in the screenshot you can do this in the command line by issuing

  • mkquorumapp 

The application JAR file is created in the /dumps directory; since you are already using the CLI, I assume you know how to retrieve it from there with scp.

On the UI your browser will offer you a download, save it and transfer it to your quorum server.

I am using a VM based on RHEL 7.3; for a real-life deployment this might be an option if you have an ESXi host at your third location. Otherwise a simple 1U rack server or even some kind of workstation might be a good option.

Deploying the quorum application

At this point you should have created the IP quorum application and installed a server with a supported Linux OS. At the end you’ll find a short summary of all commands, but here is a walk-through to explain the process.

I am not going to run an application as root if I do not have to, so creating a service user is a good idea. Mine is named “ip-quorum”.

My quorum application will be deployed into /usr/local/bin/ip_quorum and therefore I need to create the directory and change permissions accordingly:

Time to install Java – but not just any Java: you’ll need IBM Java.

As Daniel pointed out, I mentioned IBM Java but drifted off and used OpenJDK. Sorry for the error. As you can see it works, but it will not be officially supported by IBM. Get your IBM Java here.


As I haven’t had time to update the post with new instructions, please be aware that the next two steps describe OpenJDK instead of IBM Java:

Since I do not have a working Satellite installation, I’ll get mine straight from the online repositories. Have a look at the Red Hat KB for instructions regarding Satellite or RHEL 6.

For SLES and RHEL you’ll need a working subscription if you are going to do it this way.


After the installation you might want to check the status

and start the application once by running

java -jar /usr/local/bin/ip_quorum/ip_quorum.jar

As you can see, the quorum application initiates the connection to your storage array, not the other way around as with a normal service. The target port on the StorWize side is 1260/tcp if you need to implement a firewall rule.

Go back to your storage system to check the status:

Looks good – we made sure our quorum can reach the storage system on the network and application layers. Now terminate the application with CTRL-C.

Setting up a service for the quorum application

Since we want our quorum to start and stop with the server, we need some kind of autostart implementation. There are multiple ways to achieve this; I chose a systemd service definition which I call “ip-quorum”.

Create the file in the system directory of systemd

and add your content:

[Unit]
Description=IBM StorWize IP Quorum Service

[Service]
# Typo corrected by Daniel, it is Type with a capital letter. 
Type=simple
ExecStart=/usr/bin/java -jar /usr/local/bin/ip_quorum/ip_quorum.jar
StandardOutput=null
User=ip-quorum
Group=ip-quorum
WorkingDirectory=/usr/local/bin/ip_quorum/

[Install]
WantedBy=multi-user.target

Make sure you tell systemd that the unit files have changed:

systemctl daemon-reload

Enable and start the service, then do a quick check with “netstat” and in the storage system UI to see if everything is running.

TL;DR:

Here are the required commands. Please make sure to review them, fill in the needed values and do not paste them blindly into your system:

# creating the user and files

useradd -r ip-quorum
mkdir -p /usr/local/bin/ip_quorum
cp (SOURCE)/ip_quorum.jar /usr/local/bin/ip_quorum
chown -R ip-quorum:ip-quorum /usr/local/bin/ip_quorum
cd /usr/local/bin/ip_quorum
chmod 774 ip_quorum.jar

# install java
subscription-manager repos --enable rhel-7-server-supplementary-rpms
yum install -y java-1.8.0-openjdk-headless

# create service definition
cat <<'EOF' > /lib/systemd/system/ip-quorum.service
[Unit]
Description=IBM StorWize IP Quorum Service

[Service]
Type=simple
ExecStart=/usr/bin/java -jar /usr/local/bin/ip_quorum/ip_quorum.jar
StandardOutput=null
User=ip-quorum
Group=ip-quorum
WorkingDirectory=/usr/local/bin/ip_quorum/

[Install]
WantedBy=multi-user.target
EOF

# start and enable the service 
systemctl daemon-reload
systemctl enable ip-quorum
systemctl start ip-quorum