Security Benefits and Liabilities of Virtualization Technology

This paper provides a broad discussion of the security issues related to virtualization technology, such as the offerings by VMware, Microsoft and IBM. It presents an overview of virtualization, the various types of virtualization, and a detailed discussion of full computer virtualization technology. The benefits of virtualization technology are presented from the standpoint of security, convenience and cost. The paper continues with a discussion of the security liabilities of virtualization. It provides examples of recent attempts by security researchers to design attacks directed at the virtual machine manager, also known as the hypervisor. A look at trends in the application of virtualization technology concludes the discussion.

Virtualization is a type of abstraction of resources. In computer technology, virtualization can be used to simulate the presence of memory, disk, video or entire computers where they exist partially or not at all. The first virtualization technology dates back to the 1960s, when IBM and other computing pioneers created operating systems and storage systems that presented an isolated environment to each user that appeared as a single-user system. Today our desktop operating systems use memory virtualization to provide a larger runtime space for applications than there is random access memory. The operating system uses a combination of solid-state memory and a paging file on disk, moving data blocks between the two media depending on their frequency of use. Enterprise storage virtualization, such as the solutions provided by IBM, EMC and Sun, creates the illusion of massive consolidated storage by combining solid-state, magnetic disk and streaming tape into a single logical direct-access image. Less frequently accessed data blocks are migrated to slower media while often-accessed data blocks are kept on faster media. All storage appears online and ready to access. The recent popularity of virtual machines for running Java and .NET software allows a common runtime environment regardless of the actual hardware and operating system hosting the virtual machine. This approach reduces the work required by the software provider to create a solution capable of running on a variety of platforms.

Cardwell (2007) defines computer virtualization as a computer within a computer. Virtualization software simulates a computer, including the processor, hardware components and BIOS, to the guest operating system. The guest operating system running within the virtualized environment should not know or care that its hardware resources are not physical resources but are instead simulated through software. The two types of computer virtualization are called full virtualization and para-virtualization. Wong (2005) discusses the differences between full virtualization and para-virtualization. Full virtualization does not require changes to the guest operating system. Products such as VMware provide full virtualization. This type of virtualization requires support in the host system’s processor to trap and help emulate privileged instructions executed by the guest operating system. Para-virtualization requires modifications to the guest OS to run on the virtual machine manager. Open source operating systems such as Linux can be modified to support a para-virtualized environment. This type of virtualization often performs better than full virtualization, but it is restricted to guest operating systems that have been modified to run in this specific environment.
Today there are many popular, contemporary and affordable virtualization products on the market. VMware is the most widely known, but IBM has the longest history with virtualization technologies. As mentioned previously, virtualization for mainframe systems dates back to the 1960s. VMware has targeted Intel platform virtualization since the 1990s. Microsoft acquired Virtual PC as the market for virtualization grew from VMware’s popularity. Xen is an open source virtualization solution that supports both full and para-virtualized systems. It is popular with Linux distributions, which often provide para-virtualized kernels ready to deploy as guest operating systems. IBM’s two primary virtualization platforms are the System z mainframe and Power systems. “The latest version of z/VM […] will now support up to 32 processors and offer users 128 GB of memory, which will allow the software to host more than 1,000 virtual […] Linux servers.” (Ferguson, 2007).

Virtualization technology, which was originally used on centralized systems to share resources and provide a partitioned view to a single user, is now popular on server and workstation platforms running Intel x86 hardware. Cardwell (2007) presents several use cases for virtualization, including consolidation of servers, quick enterprise solutions, software development, and sales demonstrations. Separate physical servers running periodically accessed services can be virtualized and run together on a single physical system. Short-lived server systems, such as those for conferences, can be created as virtual machines without the need to acquire physical servers to host the solution. Software developers often need multiple systems to develop server-based solutions, or they require several versions of tools that may conflict when installed together. Sales demonstrations can be configured and distributed to customer-facing staff as virtual machines. Many different configurations can be created and options demonstrated to customers on demand to show how various solutions apply to their environment.

As processing capability increases on the desktop and virtualization providers offer cost-effective software for creating virtualized environments, the desktop is a primary growth area for the technology. Burt (2006) describes the mobility of virtual machines as a major benefit of desktop virtualization. Virtual machines can be stored on portable media such as USB hard disks or flash storage. They can be paused on a host system at an office, taken on a plane to a customer’s location and then resumed on a new host, all while the virtualized operating system remains completely oblivious to its actual location and host hardware. Testing and quality assurance organizations have also adopted virtualization widely. According to Tiller (2006), the benefits of virtualization include the ability to react to and test vulnerabilities and patches in a much shorter timeframe. A single virtualized system can be dedicated to an individual task in a network of systems. Upgrading or relocating any virtualized system can be performed without affecting other parts of the solution. There is also a significant benefit to security and availability. Virtual machines are separated from the host operating system, so viruses, malware and software defects that affect a virtualized system are contained and, in most cases, cannot spread to the host operating system.
Disaster recovery planning has the potential for simplification under a virtualized infrastructure. Virtual machine images, such as those used by VMware, are stored on the host operating system as files. Backing up or relocating virtual machines from one host to another can be as simple as suspending the running virtual machine, moving the set of files across the network and resuming the virtual machine. Virtual machine images can be briefly suspended and stored to tape or mirrored to a remote location as a disaster recovery process. Duntemann (2005) points out that a virtual machine, with its operating system and installed applications, is commonly stored as a set of disk files and can be archived, distributed, or restored to an initial state using the virtual machine manager. These files are also subject to attack and potential modification if the host system is compromised. A successful attack against the host system can leave its virtual machines vulnerable to modification or other penetration.

Virtualization is also a system multiplier technology. “It is very likely that IT managers will have to increase the number and expertise of security personnel devoted to security policy creation and maintenance as the percentage of VMs increase in the data center.” (Sturdevant, 2008). Where a virus would previously attack a single operating system running on a physical host, a virus can now land on the host or any of its virtualized guests. The potential for creating an army of infected systems now exists on just a single physical host. A Windows operating system running in a virtual machine is just as vulnerable to flaws and exploits as the same operating system running on a physical host. “At a broad level, virtualized environments require the same physical and network security precautions as any non-virtualized IT resource.” (Peterson, 2007). “[…] because of the rush to adopt virtualization for server consolidation, many security issues are overlooked and best practices are not applied.”

There are fundamental problems for IT administrators adopting virtualization technology within their labs and data centers. Products such as VMware have internal virtual networks that exist only within the host system. These networks allow the virtualized systems and the host to communicate without having to use the external, physical network. The difficulty is that monitoring the internal, virtual network requires the installation of tools designed for virtualized systems. Edwards (2009) points out the need for management tools that monitor communication among virtual machines and their host operating system in detail. Each host would require its own monitoring tools, versus a single installation on a network of only physical systems. Discovery and management of virtualized systems will place more burdens on IT staff, according to Tiller (2006). The ease with which virtual machines can be instantiated, relocated and destroyed will require a “quantum shift in security strategy and willingness to adapt."

As the popularity of virtualization on a smaller scale has increased, a new class of attack on virtual machines and their host virtual machine managers has received more attention. Virtual machines have unique hardware signatures that can be used to identify them and help an attacker tailor an exploit. “As it is, virtualization vendors have some work to do to protect virtual machine instances from being discovered as virtual.” (Yager, 2006). The CPU model and various device drivers loaded by the operating system can identify a virtualized system. In fact, many virtualization vendors supply device drivers for guest operating systems to take better advantage of the virtualized environment. These device drivers are just as susceptible to flaws and vulnerabilities as their non-virtualized counterparts.
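As a concrete illustration of how easily a guest can read these signatures, the following Python sketch checks two places where Linux commonly exposes them: the hypervisor CPUID flag in /proc/cpuinfo and the DMI vendor strings under /sys/class/dmi/id. This is an illustrative, Linux-only heuristic rather than a technique described by the cited authors, and the vendor strings listed are only examples.

```python
import pathlib

def detect_virtualization():
    """Best-effort check for signs that we are running inside a virtual machine (Linux only)."""
    clues = []

    # Modern hypervisors set the "hypervisor" CPUID bit, which Linux exposes in /proc/cpuinfo.
    cpuinfo = pathlib.Path("/proc/cpuinfo")
    if cpuinfo.exists() and "hypervisor" in cpuinfo.read_text():
        clues.append("hypervisor flag present in /proc/cpuinfo")

    # DMI strings often name the virtual platform outright (example signatures only).
    for key in ("sys_vendor", "product_name"):
        dmi = pathlib.Path("/sys/class/dmi/id") / key
        if dmi.exists():
            value = dmi.read_text().strip()
            if any(sig in value for sig in ("VMware", "VirtualBox", "QEMU", "KVM", "Xen", "Microsoft Corporation")):
                clues.append(f"DMI {key} reports '{value}'")

    return clues

if __name__ == "__main__":
    findings = detect_virtualization()
    print("Virtualization suspected:" if findings else "No obvious virtualization clues found.")
    for clue in findings:
        print(" -", clue)
```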
The host virtual machine managers, also known as hypervisors, are being targeted by new types of attacks as well. Vijayan (2007) points out that a dedicated hypervisor, running directly above the hardware of a computer, can be used to attack the operating systems and applications it hosts with little or no possibility of detection. The SubVirt research project by the University of Michigan and Microsoft uses virtual machine technology to install a rootkit and take control of multiple virtual machines. Finally, attacks using virtualization technology do not require hypervisor or virtual machine manager software at all. Technology present in today's microprocessors that is utilized by hypervisors can also be utilized by malware, such as rootkits and viruses, to take over a machine at the lowest level of control possible. “Security researcher Joanna Rutkowska presented a proof of concept attack known as ‘blue pill’ in 2006, that she said virtualized an operating system and was undetectable. […] Rutkowska and other have continued with such research, and this year she posited a new attack focusing on hypervisors.” (Bradbury, 2008).

Virtualization is not new to information technology. It dates back more than four decades to the early mainframes and large storage systems, where it was used to protect and better utilize available computing resources. As this paper discussed virtualization technology, it detailed the kinds, benefits and security liabilities of the technology. Information about the nature of attacks against hosts and guests in a virtualized infrastructure was presented. New virtualization products for modern, powerful server and desktop hardware are helping satisfy renewed interest in making better use of resources during a time of tightening budgets. The benefits of this updated technology must be weighed against the challenges of securing and protecting a proliferation of virtual machines. Adaptation and transformation of policies and approaches within IT organizations must be proactive to stay ahead of the disruptive change currently taking place with virtualization.

References

Bradbury, D. (2008). Virtually secure? Engineering & Technology. 8 November - 21 November, 2008. Pg. 54.
Burt, J., Spooner, J. G. (2006). Virtualization edges toward PCs. eWeek. February 20, 2006. Pg. 24.
Cardwell, T. (2007). Virtualization: an overview of the hottest technology that is changing the way we use computers. www.japaninc.com. November/December, 2007. Pg. 26.
Duntemann, J. (2005). Inside the virtual machine. PC Magazine. September 20, 2005. Pg. 66.
Edwards, J. (2009). Securing your virtualized environment. Computerworld. March 16, 2009. Pg. 26.
Ferguson, S. (2007). IBM launches new virtualization tools. eWeek. February 12/19, 2007. Pg. 18.
Peterson, J. (2007). Security rules have changed. Communications News. May, 2007. Pg. 18.
PowerVM. (2009). IBM PowerVM: The virtualization platform for UNIX, Linux and IBM i clients. Retrieved July 25, 2009 from http://www-03.ibm.com/systems/power/software/virtualization/index.html.
Sturdevant, C. (2008). Security in a virtualized world. eWeek. September 22, 2008. Pg. 35.
Tiller, J. (2006). Virtual security: the new security tool? Information Systems Security. July/August, 2006. Pg. 2.
Vijayan, J. (2007). Virtualization increases IT security pressures. Computerworld. August 27, 2007. Pg. 14.
Wong, W. (2005). Platforms strive for virtual security. Electronic Design. August 4, 2005. Pg. 44.
Yager, T. (2006). Virtualization and security. Infoworld. November 20, 2006. Pg. 16.

August 1, 2009 · 10 min · 2125 words · Jim Thario

Use of Cryptography in Securing Database Access and Content

This research paper explores the use of cryptography in database security. It specifically covers applications of encryption in authentication, transmission of data between client and server, and protection of stored content. The paper begins with an overview of encryption techniques, specifically symmetric and asymmetric encryption. It follows with a discussion of the use of cryptography in database solutions. The paper concludes with a short summary of commercial solutions intended to increase the security of database content and client/server transactions.

Whitfield Diffie, a cryptographic researcher and Sun Microsystems CSO, says, “Cryptography is the most flexible way we know of protecting [data and] communications in channels that we don’t control.” (Carpenter, 2007). Cryptography is “the enciphering [encryption] and deciphering [decryption] of messages in secret code or cipher; the computerized encoding and decoding of information.” (CRYPTO, 2009). There are two primary means of encryption in use today: symmetric key encryption and asymmetric key encryption. Symmetric key encryption uses a single key to encrypt and decrypt information. Asymmetric key encryption, also known as public key cryptography, uses two keys - one to encrypt information and a second to decrypt it. In addition to encryption and decryption, public-key cryptography can be used to create and verify digital signatures of blocks of text or binary data without encrypting them. A digital signature is a small block of information cryptographically generated from content, such as an email message or an installation program for software. The private key in the asymmetric scheme is used to create a digital signature of data, while the public key verifies the integrity of the data against the signature created with the private key. The main advantage of public key cryptography over the symmetric key system is that the public key can be given away - as the name implies, made public. Anyone with a public key can encrypt a message, and only the holder of the matching private key can decrypt that message. In the symmetric system, all parties must hold the same key.

Public key cryptography can also be used to verify the identity of an individual, application or computer system. As a simple example, suppose I have an asymmetric key pair and provide you with my public key. You can be a human or a software application. As long as I keep my private key protected so that no one else can obtain it, only I can generate a digital signature that you can use with my public key to prove mathematically that the signature came from me. This approach is much more robust and less susceptible to attack than the traditional username and password approach. Application of cryptography does not come without the overhead of ongoing management of the technology. In an interview (Carpenter, 2007), Whitfield Diffie, a co-inventor of public key cryptography, says the main obstacle to widespread adoption of strong encryption within I.T. infrastructures is key management - managing the small strings of data that keep encrypted data from being deciphered. Proper integration of cryptographic technologies into a database infrastructure can provide protection beyond username and password authentication and authorization, preventing unauthorized parties from reading sensitive data in transit or at rest on storage media.
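As a small illustration of the symmetric case described above, the sketch below encrypts a sensitive value before it is stored, using the third-party Python cryptography package (an implementation choice assumed here, not one named in the paper). The comment about key handling echoes Diffie's point that managing keys, not choosing algorithms, is the hard part.

```python
from cryptography.fernet import Fernet  # pip install cryptography

# Key management is the hard part: this key must be stored and protected
# separately from the data it encrypts (for example, in a key management service).
key = Fernet.generate_key()
cipher = Fernet(key)

# Encrypt a sensitive value before it is written to storage ...
ciphertext = cipher.encrypt(b"123-45-6789")

# ... and decrypt it only when an authorized caller needs the plaintext.
assert cipher.decrypt(ciphertext) == b"123-45-6789"
```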
Some U.S. government standards require the use of encryption for stored and transmitted personal information. Grimes (2006) details the recent laws passed in the United States requiring the protection of personal data. These laws include the Gramm-Leach-Bliley Act for protection of consumer financial data, the Health Insurance Portability and Accountability Act for personal health-related data, and the Electronic Communications Privacy Act, which gives broad legal protection to electronically transmitted data.

As discussed above, public key cryptography can be used to authenticate a person, application or computer using digital signature technology. A database management system enhanced to use public keys for authentication would store those keys and associate them with specific users. The client would use its private key to sign a small block of data randomly chosen by the server. The client would return a digital signature of that data, which the server could verify using the stored public keys of the various users. A verification match would identify the specific user.

The second application of encryption technology in database security is protecting the transmission of data between a client and server. The client may be a web-based application running on a separate server and communicating over a local network, or it may be a fat client located in another department or at some other location on the Internet. A technology called TLS can be used to provide confidentiality for all communications between the client and server, i.e. the database connection. “Transport Layer Security (TLS) and its predecessor, Secure Sockets Layer (SSL), are cryptographic protocols that provide security and data integrity for communications over networks such as the Internet.” (TLS, 2009). Web servers and browsers use the TLS protocol to protect data transmissions such as credit card numbers or other personal information. The technology can be used to protect any data transmission for any type of client-server solution, including database systems. TLS also has authentication capability using public key cryptography. This type of authentication would only allow known public keys to make a connection, but it is not integrated at a higher level in the solution, such as the application level.

Finally, cryptography can be used to protect the entire content of database storage, specific tables or columns of table data. Encrypting stored content can protect sensitive data from unauthorized access within the database management system, from loss of the storage media, and from external processes that read raw data blocks from the media. The extent to which stored content is encrypted must be weighed against the overhead of encrypting and decrypting data in transaction-intensive systems. Britt (2006) stresses the importance of selectively encrypting only those portions of the content that are evaluated to be a security risk if released to the public. He says a “[…] misconception is that adding encryption will put a tremendous strain on database performance during queries and loads.” This type of protection often uses symmetric key encryption because it is much faster than the public key alternative. Marwitz (2008) describes several levels of database content encryption available in Microsoft SQL Server 2005 and 2008. SQL Server 2008 provides the ability to use public key authentication directly in the access control subsystem.
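The challenge-response scheme described above can be sketched in a few lines. The following example, again using the Python cryptography package as an assumed implementation rather than any vendor's actual protocol, has the client sign a server-chosen random challenge and the server verify it against the registered public key.

```python
import os
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

# Client side: generate a key pair once; the public key is registered with the server.
client_private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
registered_public_key = client_private_key.public_key()

# Server side: issue a fresh random challenge for this login attempt.
challenge = os.urandom(32)

# Client side: sign the challenge with the private key.
signature = client_private_key.sign(
    challenge,
    padding.PSS(mgf=padding.MGF1(hashes.SHA256()), salt_length=padding.PSS.MAX_LENGTH),
    hashes.SHA256(),
)

# Server side: verify the signature against the stored public key.
try:
    registered_public_key.verify(
        signature,
        challenge,
        padding.PSS(mgf=padding.MGF1(hashes.SHA256()), salt_length=padding.PSS.MAX_LENGTH),
        hashes.SHA256(),
    )
    print("Authentication succeeded: signature matches the registered public key.")
except InvalidSignature:
    print("Authentication failed.")
```

Because the private key never travels to the server, a compromised server does not yield credentials that let an attacker impersonate the user.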
Additionally, the entire database server storage, individual databases and table columns can be encrypted using public key encryption (SQLS, 2009). Table columns, such as those used to store social security numbers, credit card numbers, or other sensitive personal information, are a good choice for performance-sensitive systems. With this capability, the only way to obtain access to the unencrypted data within a protected column is to use the private key of an individual who has been granted access. The user’s private key is used to authenticate and gain access to information in the database. Extra protection is gained because the private key is never co-located with the encrypted data.

IBM’s DB2 product supports a number of different cryptographic capabilities and attempts to leverage as many of the capabilities present in the hosting operating system as possible, whether Intel-based, minicomputer or mainframe. Authentication to the database from a client can be performed over a variety of encrypted connection types or using Kerberos key exchange. DB2 also supports authentication plug-ins that can be used with encrypted connections. After authentication has succeeded, DB2 can carry client-server data transmission over a TLS connection and optionally validate the connection using public key cryptography. Like Microsoft SQL Server, the most recent releases of DB2 can encrypt the entire storage area, single databases, or specific columns within a database (DB2, 2009).

This paper provided a broad survey of how cryptographic technologies can raise the security posture of database solutions. Cryptography is becoming a common tool for solving many problems of privacy and protection of sensitive information in growing warehouses of online personal information. This paper described the use of cryptography in database client authentication, transmission of transaction data, and protection of stored content. Two commercial products’ cryptographic capabilities were explored in the concluding discussion. There are more commercial, free and open source solutions for protecting database systems than those mentioned in this paper. As citizens and government continue to place pressure on institutions to protect private information, expect the landscape of cryptographic technologies for database management systems to expand.

References

Britt, P. (2006). The encryption code. Information Today. March 2006, vol. 23, issue 3.
Carpenter, J. (2007). The grill: an interview with Whitfield Diffie. Computerworld. August 27, 2007. Pg. 24.
CRYPTO. (2009). Definition of cryptography. Retrieved 18 July 2009 from http://www.merriam-webster.com/dictionary/cryptography.
DB2. (2009). DB2 Security Model Overview. Retrieved 18 July 2009 from http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/topic/com.ibm.db2.luw.admin.sec.doc/doc/c0021804.html.
Grimes, R. A. (2006). End-to-end encryption strategies. Infoworld. September 4, 2006. Pg. 31.
Marwitz, C. (2008). Database encryption solutions: protect your databases - and your company - from attacks and leaks. SQL Server Magazine. September 2008.
SQLS. (2009). Cryptography in SQL Server. Retrieved 18 July 2009 from http://technet.microsoft.com/en-us/library/cc837966.aspx.
TLS. (2009). Transport layer security. Retrieved 18 July 2009 from http://en.wikipedia.org/wiki/Transport_Layer_Security.

July 22, 2009 · 8 min · 1533 words · Jim Thario

Application of Formal Methods in the Design of Reliable and Secure Software

This research paper explores the use of formal methods in software engineering to design reliable and secure software systems. Formal methods are mathematically based languages or visual notations for specifying behavior, algorithms or other aspects of program execution while remaining technology independent. This paper provides a brief overview of formal methods and several of the more popular formal method notations in use today for software and systems development. It presents the benefits and drawbacks of formal methods, including reasons why they are not commonplace for all software development. The precision of formal methods provides some opportunity for automation in the software development lifecycle, including code generation and automated testing. An exploration of several problem domains where formal methods are often applied is provided. The paper concludes with a discussion of the viability of formal methods as a continuing tool of software engineering.

Hinchey (2008) defines formal methods as an approach in which “[…] a specification notation with formal semantics, along with a deductive apparatus for reasoning, is used to specify, design, analyze, and ultimately implement a hardware or software (or hybrid) system.” Formal methods have a relationship to some of the earliest research in algorithms and automated computation. Pure mathematics and symbolic languages were the sole means of algorithmic expression before general-purpose software languages and microprocessors. One early incarnation of a language for computation was the Turing machine, conceived by Alan Turing in 1936. Turing machines are “[…] simple abstract computational devices intended to help investigate the extent and limitations of what can be computed.” (TM, 2009). Before automated computation was truly possible, many scientific minds were working on ways to direct a computational machine in precise ways.

Traditionally, formal methods are used in the specification and development of systems requiring high dependability, such as communication, flight control and life support. Something is dependable if its performance is consistent. Reliability is the degree to which something is accurate, stable, and consistent. Security is a guarantee against loss or harm. Hanmer (2007) discusses the relationship between security and dependability, and the quality attributes the two have in common when developing a system. He states that something is dependable if it exhibits reliability, maintainability, availability and integrity. Something is secure if it exhibits availability, integrity and confidentiality. The commonality between the two sets is availability and integrity. In the information technology world, the opposites of these two qualities are downtime and inconsistency - something we often see today as a result of informal software specification and lackluster development processes.

As mentioned above, formal methods can be applied in the phases of specification, design, implementation or verification of software systems. There is potential use for formal methods throughout the entire development lifecycle. Requirements for software systems typically come from stakeholders in the domain in which the software is used, such as aerospace or finance. Those requirements arrive in human-readable form and need an initial transformation into a more precise language. Software designers can refine the formal specification through a series of iterations and deliver it to developers for implementation. The architecture, functionality and quality attributes of the software can be checked against the formal specifications during peer reviews with other designers and developers. Finally, the teams responsible for testing and verification of the system’s proper operation can use the formal specifications as scripts in developing test suites for automated or manual execution.
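To make the idea of carrying a specification from design into implementation and testing concrete, here is a deliberately informal sketch: a pre/postcondition specification, far simpler than a real formal notation such as Z or B, expressed as runtime checks in Python, with a test derived from the specification's boundary case. The withdraw example is invented for illustration.

```python
def withdraw(balance: int, amount: int) -> int:
    """Withdraw `amount` from `balance`.

    Informal specification, in the spirit of a pre/postcondition spec:
      precondition:  0 < amount <= balance
      postcondition: result == balance - amount and result >= 0
    """
    # Precondition check: reject calls that violate the specification.
    assert 0 < amount <= balance, "precondition violated"

    result = balance - amount

    # Postcondition check: the implementation must satisfy the specification.
    assert result == balance - amount and result >= 0, "postcondition violated"
    return result

# A test derived directly from the specification's boundary case.
assert withdraw(100, 100) == 0
```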
The specifications produced by formal methods can be used for more than documentation of a system’s requirements and behavior. The precision of many formal methods allows automation to reduce human error and increase consistency in the delivery of the final product. Translating some or all of a formal specification into a general-purpose programming language is possible, freeing developers to concentrate on interesting refinements and optimization of the code rather than laboriously writing every line by hand. Stotts (2002) describes a project in which JUnit test cases were generated from formal method specifications. The automated approach enabled them “[…] to generate more test methods than a programmer would by following the basic JUnit practice, but our preliminary experiments show this extra work produces test suites that are more thorough and more effective at uncovering defects.” The formal methods research team at NASA Langley Research Center has developed a domain-specific formal method language called the Abstract Plan Preparation Language. The team’s goal in creating the language is “[…] to simplify the formal analysis and specification of planning problems that are intended for safety-critical applications such as power management or automated rendezvous in future manned spacecraft.” (Butler, 2006).

There are economic disadvantages to applying formal methods in software development projects. Formal methods are typically more mathematically intensive than flowcharts or other modeling notations. They are also more precise and rigorous, which results in more time spent expressing the solution in a formal method notation than in a visual modeling language. A developer experienced in application-level design and implementation may lack the education in computational mathematics required to work with formal method notation. A primary complaint from designers and developers is that the solution must be specified twice: once in the formal method notation and again in the software language. The same argument persists in the visual modeling community, which embraces model-to-code transformation to reduce the duplication of effort. The availability of formal method transformation tools that generate source code helps eliminate this issue as a recurring reason not to use formal methods.

Several formal methods are popular today, including Abstract State Machines, B-Method, Petri nets and the Z (zed) notation. Petri nets date back to 1939, Z was introduced in 1977, abstract state machines appeared in the 1980s, and B-Method is the most recent, dating from the 1990s. Petri nets are found in the analysis of workflows, concurrency and process control. The Z formal method language is based on notations from axiomatic set theory, lambda calculus and first-order predicate logic (Z, 2009). It was standardized by ISO in 2002. Abstract state machines resemble pseudo-code and are easy to translate into software languages. Several tools exist to verify and execute abstract state machine code, including CoreASM, available on SourceForge.net. Finally, B-Method is a lower-level specification language with a wide range of tool support. It is popular in the European development community and has been used to develop safety systems for the Paris Metro rail line (BMETH, 2009).
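Because abstract state machines resemble pseudo-code, even a toy model translates almost directly into an ordinary programming language. The following Python sketch, which is illustrative and not CoreASM syntax, models a traffic light as a state (a set of location-value pairs) advanced by guarded update rules applied together at each step.

```python
# A toy, abstract-state-machine-style model of a traffic light.
state = {"light": "red", "timer": 0}

def step(state):
    updates = {}
    if state["timer"] >= 3:
        # Guarded rule: when the timer expires, change colour and reset the timer.
        nxt = {"red": "green", "green": "yellow", "yellow": "red"}
        updates["light"] = nxt[state["light"]]
        updates["timer"] = 0
    else:
        # Otherwise just advance the timer.
        updates["timer"] = state["timer"] + 1
    # Updates are computed against the current state and applied together,
    # mirroring the simultaneous-update semantics of an ASM step.
    return {**state, **updates}

for _ in range(10):
    state = step(state)
    print(state)
```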
The use of formal methods as a way of increasing software dependability and security remains strong in industries where even partial failure can result in unacceptable loss of money, time and, most importantly, life. The choice to apply formal methods in a development project is often an economic, risk-based decision. There will continue to be application programs without the budget or the luxury of time to add the extra process and labor required to transform requirements into formal method specifications and then into source code. However, the pattern of formal method use remains consistent in safety- and security-critical systems. The development and refinement of formal methods continues into this decade, most recently with the ISO standardization of the Z notation. The activity surrounding tooling and automation to support formal methods during the development lifecycle appears to be growing. Perhaps the software industry is closing in on a point of balance among formality in specification, time to market and automation in solution development.

References

ASM. (2009). Abstract State Machines. Retrieved 11 July 2009 from http://en.wikipedia.org/wiki/Abstract_State_Machines.
BMETH. (2009). B-Method. Retrieved 11 July 2009 from http://en.wikipedia.org/wiki/B-Method.
Butler, R. W. (2006). An Abstract Plan Preparation Language. NASA Langley Research Center, Hampton, Virginia. NASA/TM-2006-214518. Retrieved 11 July 2009 from http://shemesh.larc.nasa.gov/fm/papers/Butler-TM-2006-214518-Abstract-Plan.pdf.
Hanmer, R. S., McBride, D. T., Mendiratta, V. B. (2007). Comparing Reliability and Security: Concepts, Requirements, and Techniques. Bell Labs Technical Journal 12(3), 65–78 (2007).
Hinchey, M., Jackson, M., Cousot, P., Cook, B., Bowen, J. P., Margaria, T. (2008). Software Engineering and Formal Methods. Communications of the ACM, 51(9). September 2008.
Stotts, D., Lindsey, M., Antley, A. (2002). An Informal Formal Method for Systematic JUnit Test Case Generation. Technical Report TR02-012. Department of Computer Science, Univ. of North Carolina at Chapel Hill. Retrieved 11 July 2009 from http://rockfish.cs.unc.edu/pubs/TR02-012.pdf.
TM. (2009). Turing machine. Retrieved 11 July 2009 from http://plato.stanford.edu/entries/turing-machine/.
Z. (2009). Z notation. Retrieved 11 July 2009 from http://en.wikipedia.org/wiki/Z_notation.

July 17, 2009 · 7 min · 1369 words · Jim Thario

Research Project Proposal: Model-Driven Information Repository Transformation and Migration

This project will apply the Unified Modeling Language to the visual definition of data transformation rules that direct the execution of data migration from one or more source information repositories to a target information repository, and will result in a UML profile optimized for defining data transformation and migration among repositories. I believe that a visual approach to specifying and maintaining the rules of data movement between the source and target repositories will decrease the time required to define these rules, enable less technical individuals to adopt them, and provide a motivation to reuse these models to accelerate future migration and consolidation efforts.

Problem Statement and Background

My role in this project includes project planning and task management, and acting as primary researcher and developer of the deliverables of the project. My technical background includes IBM certification as an OOAD designer in Unified Modeling Language and nearly two decades as a software engineer. I have recently been involved in the migration of several custom knowledge data repositories to an installation of IBM Rational Asset Manager.

This project will use a constructive ontology and epistemology to create a new solution in the problem space of the project. This is the most appropriate research ontology and epistemology because there is little precedent available in exactly this area of research. Visual modeling of program specifications has been studied in other problem domains and continues to be an area of interest. This particular problem space is unique, relatively untouched, and in an area of considerable interest to me. A possible constraint of the project is shortcomings in the UML metamodel rules that may limit the extension and definition of an effective rules-based data transformation and migration language. A second constraint of the project may be the identification of one or more source repositories as candidates for moving to a new system. For the second constraint, one or more simulated repositories may need to be created.

This study is relevant to software engineering practitioners, information technology professionals, database administrators and enterprise architects who wish to consolidate data repositories into a single instance. Unified Modeling Language (UML) is primarily used today in information technology to visually specify requirements, architectures and designs of systems, to verify and create test scenarios, and to perform code generation. The UML metamodel was designed to make the language extensible, with the ability to support profiles that allow the language to be customized for specific problem domains. Researchers and practitioners are finding innovative uses for UML as a visual specification language. Zulkernine, Graves, Umair and Khan (2007) recently published their results in using UML to visually specify rules for a network intrusion detection system. Devos and Steegmans (2005) also published their results in using Unified Modeling Language in tandem with the Object Constraint Language to specify business process rules with validation and error checking.

This project will contribute to at least two fields of information technology: visual modeling languages, and information consolidation and management. This project will make a unique contribution to the subject area of domain-specific visual languages for the definition of rules.
Additionally, a successful outcome from this project will contribute to knowledge in the area of lowering the complexity of consolidating repositories to save operations costs and increase modernization of data access systems. An opposing approach to this project would be a federated solution to data consolidation. A federated solution would continue to maintain multiple data repositories and connect their operations via programming interfaces so that clients could access them and combine their data to create the appearance of a unified source.

The project area of focus was motivated by my desire to create a visual system for complete migration of a source repository of technical data, such as a technical support knowledge base, to a new product called Rational Asset Manager. My overall goal was to drive the entire migration visually using a single model specification. This specification would visually specify the rules for migrating and transforming data from one system to another as well as visually select the technical mechanisms used to communicate with each information repository, such as SQL databases, web services, XML translation, etc. In addition, I wanted to generate some executable code from the models that would carry out some or all of the movement of data between repositories. In scaling this broad problem area down, I decided to focus on using the model as a specification that would be read by an existing program to carry out the instructions in the model. This program already exists, but does not yet know how to read models. Finally, in narrowing to a specific part of the visual specification, I decided to concentrate on an aspect of the model that locates data in one system, potentially re-maps or transforms it, and places it into the target system. The initial research focus therefore takes the form of a UML profile that can be used to specify this aspect of the solution and an extension of the existing migration program to use the model to perform its work.

Project Approach and Methodology

This project will use a design science methodology to iteratively create, test, and refine the deliverables of the project's outcome. The design science methodology defines five process steps in achieving the outcome of a research project: awareness of problem, suggestion, development, evaluation, and conclusion. This project is currently at the awareness of the problem phase. The inputs to this phase have been my experiences working within the problem space for the last several years and the secondary research into the problem area performed thus far. I have encountered shortcomings in automation to help accelerate solutions in this problem space. At the same time, I have observed closely related problems overcome using visual and declarative technologies. Additional secondary research is being conducted to understand the body of knowledge associated with this area of visual modeling. The output of this phase is this proposal for a project to develop a visual language to help accelerate solutions in this problem space. Significant elements of the proposal include the overall vision of the project, the risks of the project, the tools and resources required to carry out the project, and the initial schedule to complete the project. Following an accepted proposal, the next phase of this methodology is the suggestion phase, which involves a detailed analysis and design of the proposed solution. During the suggestion phase, several project artifacts will be created and updated with new information.
Updated artifacts include the project risks and a refined schedule for completion of the project. New artifacts produced in this phase include early UML and migration tool prototypes to explore various technical alternatives, detailed test and validation plans, and most importantly the design plans for the following phase of the project. A significant activity performed in this phase is the acquisition and readiness of the project resources, such as physical labs, input test data from candidate repositories, access to networked systems to acquire the test data, and installation of hardware and software tools.

The development phase of the project uses the design plans established in the suggestion phase to focus on construction of the first iteration of the solution. Experiences during this phase also drive refinements to the project schedule, the detailed test and validation plans, the risks, and the design plan of the solution. The deliverable of this phase is the first generation of the UML profile and extensions to the existing migration tool to support parsing and using models created with the profile. The test specification models are used to move a larger portion of the candidate source repositories to the target repository. After conclusion of this phase, the project may return to an earlier phase to refine plans or project scope based on what is learned during the development of the solution. If acceptable progress is demonstrated at the conclusion of this phase, the project will continue to the evaluation phase.

The evaluation phase focuses most of its effort on formal testing and validation of the solution produced in the development phase. The evaluation of the work against the thesis includes working with specific individuals to determine whether this is indeed an approach that will save time and simplify the specification of data migration and transformation rules. Documentation of the testing outcome and comparison to the anticipated outcome may cause the project to return to an earlier phase to adjust scope or expectations. If it is decided the project has met its goals, or the goals are not achievable by the project's approach, the effort will conclude.

The conclusion of this project will involve final documentation of the outcome and packaging of all the project's artifacts for future research studies. The project's artifact package will be placed in a public location for others to review and use.

As mentioned above, this project will require several physical resources and cooperation from technical experts. The study will require access to two or more legacy data repositories as sources of information. The source repositories should ideally utilize different underlying database technologies and implement different information schemas to test variations of the proposed modeling language as it is developed and tested. Access to the technical administrators of the source repositories will be necessary to understand the repositories' schemas and obtain read-only access or a copy of their information. It is preferred that the repositories be accessed read-only over a network, or that the information be relocated to a computing system directly available to the research project. The study will require at least one server system running IBM's Rational Asset Manager. This system will act as the target data repository. Data transformed from the source repositories will migrate into Rational Asset Manager, driven by a migration application that uses the visual specifications as direction. The study will also require a single workstation with IBM Rational Software Architect for development of the visual modeling language and extension of the existing migration programs to read the visual models and perform the migration work from the source to target repositories.
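To give a sense of the kind of transformation rules the proposed UML profile would capture, the sketch below expresses a few field-mapping rules in plain Python and applies them to a single record. Every name in it (the field names, the RULES table, migrate_record) is hypothetical; in the proposed approach these rules would be read from a model annotated with the profile rather than written by hand.

```python
# Hypothetical, hand-written transformation rules. In the proposed approach these
# rules would come from a UML model annotated with the migration profile rather
# than from a Python dictionary; every name below is illustrative only.
RULES = {
    "doc_title":   {"target": "asset_name"},
    "doc_body":    {"target": "description"},
    "owner_email": {"target": "contact", "transform": str.lower},
}

def migrate_record(source_record: dict) -> dict:
    """Apply the mapping rules to one source record, producing a target record."""
    target = {}
    for source_field, rule in RULES.items():
        if source_field not in source_record:
            continue  # tolerate missing optional fields in legacy repositories
        value = source_record[source_field]
        transform = rule.get("transform")
        target[rule["target"]] = transform(value) if transform else value
    return target

legacy_row = {"doc_title": "Install Guide", "doc_body": "...", "owner_email": "Admin@Example.COM"}
print(migrate_record(legacy_row))
```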
A requirement for determining the project's success is the need to measure the savings in time to build a migration solution with and without visual specifications. The migration problems need to be varied as well, from simple one-to-one mappings from a single source repository to a single target repository, to more exotic migration scenarios, such as consolidating multiple source repositories into a single target repository and re-mapping values from the source to the target. Additionally, the reusability of previous solutions will be measured. This aspect of the project's outcome will quantify how easily a specification model can be reused from a previous solution.

Definition of the End Product of Project

This project will produce several artifacts during the project's life and at its conclusion. Most importantly, a UML profile will be developed that can be imported into Rational Software Architect or Rational Software Modeler. The profile will include usage documentation and example models that demonstrate the various types of rules that may be specified in a visual model and how such a model is read and executed by the migration program. The migration program will be a reference implementation of an existing tool that can read a model configured with the UML profile and generate events for extension points on which to act.

In addition to technical deliverables, all project planning and process artifacts, such as the project plan, design plan, risks and mitigation notes, test criteria and test result data, will be made available. The project will conclude with the development of at least one article or paper for submission to a research journal to document this project's challenges and achievements, and an annotated bibliography of secondary research related to the project will be provided.

If successful, this project will contribute to simplifying part of the process of developing a migration solution without having to recreate the existing tool used today. The project will add a new component to the migration tool, and consumers of the tool can choose to use this new component. An assumption made in this research project is that the UML profile developed as a deliverable will be an approachable alternative for less experienced IT professionals and software engineers. Validating this assumption will be a challenge for the project's results.

References

Devos, F., Steegmans, E. (2005). Specifying business rules in object-oriented analysis. Softw Syst Model (2005) 4: 297–309 / Digital Object Identifier (DOI) 10.1007/s10270-004-0064-z.
Zulkernine, M., Graves, M., Umair, M., Khan, A. (2007). Integrating software specifications into intrusion detection. Int. J. Inf. Secur. (2007) 6:345–357. DOI 10.1007/s10207-007-0023-0.

December 10, 2008 · 10 min · 2066 words · Jim Thario

Reducing Adoption Barriers of Agile Development Processes in Large and Geographically Distributed Organizations

Agile software development processes have received much attention from the software development industry in the last decade. The goal of agile processes is to emphasize people as the primary contributors to a project and to reduce the administrative overhead of producing working code for the stakeholders of the project. This paper explores some of the explicit and implied constraints of agile software development processes. It focuses on several common practices of agile processes, particularly those that might limit their adoption by large and geographically distributed organizations. This paper makes recommendations to reduce the barriers to adoption of agile processes by these types of organizations. It attempts to answer questions such as: Is it possible for a large organization with many established business and development processes to incrementally adopt an agile process? Is it possible to adapt agile development processes to work for many individuals who are physically isolated, such as work-at-home software developers? Is it possible to adapt agile development processes to work for a large team, divided into many sub-teams that are geographically distributed and possibly working in different time zones?

Extreme Programming is probably one of the most recognized agile software development processes today. It was introduced in the late 1990s by Kent Beck and eventually published as a book (Beck, 2005). Beck's approach documented the values, principles and practices necessary to deliver lower-defect, working software with less formal process and more focus on the skills of the people and community that produce it. Extreme Programming is targeted at small, collocated teams of about twelve people. Other proponents of agile software development processes understood the increasing interest in their approaches by the software industry and followed with the Manifesto for Agile Software Development. The contributors to the Manifesto were the creators of many different agile, iterative and incremental software development processes. Their goal was to unify the principles they shared in common. The work was authored by “[…] representatives from Extreme Programming, SCRUM, DSDM, Adaptive Software Development, Crystal, Feature-Driven Development, Pragmatic Programming, and others […]” (Manifesto, 2001).

Beck and Andres (2005) present the primary practices of Extreme Programming in their book. Two practices stand out as limitations when scaling Extreme Programming to teams in multiple locations, or even to work-at-home employees: Sit Together and Pair Programming. Sit Together is a practice that encourages the team to work in a unified area, such as a large, open room that promotes easy communication. Pair Programming is a technique where two developers sit together at a single workstation and take turns designing and writing code. As one developer is writing code, the other is observing, asking questions and offering suggestions as the current piece of work progresses. The goal of these two practices is to lower the defect rate through the constantly available communication and collaboration of developers sharing the same physical space. Beck and Andres (2005) also discuss the importance of team size in a project that uses Extreme Programming. They recommend a team size of about twelve people. The reason for this size has as much to do with coordination of development activities as it does with the psychological needs of being part of a team.
The larger a team grows, the less personal the connections between team members become. Faces are more difficult to remember, and communication among all members becomes less frequent. These challenges with team size are amplified with work-at-home software developers, who may only be in the physical presence of other members of the team a few times a year at specific events such as all-hands meetings. Active and regular communication is a requirement of agile software development. Ramesh (2006) describes the perceived advantages of teams distributed across time zones and continuous development: as one team ends its day and goes to bed, another is coming to work to pick up where the last left off. However, there is actually a communication disconnect between geographically distributed teams in this situation, and the teams are forced into a mode of asynchronous communication, potentially slowing down progress. This problem relates to two principles of the Manifesto for Agile Software Development (2001) that present a challenge to geographically distributed teams. The first is "Business people and developers must work together daily throughout the project." The second is "The most efficient and effective method of conveying information to and within a development team is face-to-face conversation." Both principles are related to communication among developers, management, stakeholders and users of the project.

Lindvall (2004) points out that incremental adoption of agile practices into an existing large organization can be challenging. An established organization typically expects that existing business and development processes are followed regardless of project size and the process used. Educating those outside of the agile pilot project and resetting their expectations for following the established processes can create tension. A specific example is that development in agile-driven projects usually starts with a subset of the requirements set. This is a quality of agile development processes and has to do with working on what is understood to be the goal of the project today. As working builds are created and delivered to stakeholders, the requirements set can be appended and refined until there is agreement that a reasonable goal has been established. Murthi (2002) documents a case of using Extreme Programming on a 50-person development project and cites the ease of starting early with a partial requirements set, then using the subsequent working results for two goals: showing stakeholders working software to build confidence in the development team, and giving stakeholders something to help refine their own understanding of their needs.

Incrementally developed requirements and the constantly refined budgeting and financial burn rate that are typical of agile development process management can present a unique challenge to a project that is completely or partially outsourced. Cusumano (2008) details the need for an iterative contract between the customer and outsourcing provider. A fixed-price contract can be nearly impossible to design when agile development processes are in use by either party. Boehm (2005) also discusses the problem of using agile processes within the realm of contracting to the private and public sector. Problems can be encountered when measuring progress toward a contract's completion. For a consumer following an agile process, the requirements can remain a moving target well into the project's life cycle.
For a provider following agile development processes, it can become nearly impossible to provide final system architectural details to the consumer for review early in the life cycle. Boehm also points out the difficulties providers utilizing agile processes must overcome when seeking certification under CMMI and ISO-related international standards.

The barriers to agile development process adoption by a large or geographically distributed organization can be reduced by a combination of two approaches. The first approach is the application of tooling and technologies that support the practices of agile software development and scale to an organization's needs. The second approach is to continuously refine, over time, the practices in conflict with the organization's existing mode of operation. Examples of practices that can be refined through technology adoption are Sit Together and Pair Programming from Extreme Programming, and the daily collaboration and face-to-face interaction among customers and developers recommended as principles of the Manifesto for Agile Software Development. These practices and principles are the most obvious barrier to adoption of pure agile development processes within a large or geographically distributed team. The essence of the Sit Together practice is to provide a means for team members to communicate at will. Technologies that help support this practice in distributed environments include instant messaging systems, which provide a mechanism for short question-and-answer sessions between two or more participants in the project at once. Longer conversations among the team can be supported through VOIP solutions, reservation-less teleconference solutions, Skype and XMPP-based messaging solutions that allow several team members at a time impromptu contact and discussion opportunities for project issues. Speakerphones allow collocated sub-teams to participate in conversations about the project across geographic locations. In all the examples cited, full-duplex voice communication is essential for effective discussion among several team members at once. This type of communication allows the audio channels to work in both directions simultaneously, which means someone can talk and interrupt the current speaker as they could in person. Many inexpensive speakerphones are half duplex. These devices block the receiving audio channel while a person is speaking. Someone wanting to stop the speaker to clarify a point is unable to do so until the person speaking pauses. Background noise, such as a loud computer fan or air conditioner, can cause similar problems for half-duplex communication systems.

Pair Programming can be performed through a combination of voice communication and desktop screen sharing technology. Individuals working within the same network or virtual private network can use solutions like Microsoft NetMeeting or Virtual Network Computing (VNC) to share, view and work within each other's development environment and perform pair programming over any distance. Web-based and wide-area-network tooling to support the incremental development and tracking of plans, requirements and defects is available from several vendors such as IBM and Rally Software Development Corporation. Gamma (2005) presented The Eclipse Way at EclipseCon several years ago.
The motivation behind his presentation was the many requests he received from users of the Eclipse environment to understand how a team distributed throughout the world could continue to release as planned and with a low defect rate. The Eclipse Foundation has a centralized data center in Canada for several of its activities, including continuous integration and automated testing of nightly builds. The build and testing process of the Eclipse environment is fully automated for each platform it supports. Additionally, end users are encouraged to install and use nightly builds after they pass the automated suite of tests. Other barriers to adopting agile development processes cannot be solved with tooling alone. Ramesh (2006) found that the solution to working across multiple time zones is to synchronize some meetings and rotate their times, so that each group takes its turn suffering an extraordinarily early or late meeting and everyone on the project can communicate live. Solving the opposing forces in contract negotiation requires creativity. Boehm (2005) recommends disbursing “[…] payments upon delivery of working running software or demonstration of progress rather than completion of artifacts or reviews.” According to Boehm, there is not yet a well-defined way to reconcile agility in process with ISO or CMMI-related certification. Lindvall (2004) concluded that adoption of agile development processes by large organizations is best accomplished through hybrid integration with the existing processes, particularly the established quality processes. With this approach, the existing quality processes can be used to measure the effectiveness of the agile software development process under pilot. This paper described several of the qualities shared by different agile software development processes. It focused on those aspects that potentially limit agile process adoption by large and geographically distributed organizations. The recommendations made in this paper include technology solutions to improve collaboration and communication among distributed developers and consumers of the project. The technology considerations also help alleviate management concerns such as incremental planning and budgeting of agile projects. Recommendations were also provided for large organizations with established processes, including approaches that pilot projects using agile development can take to leverage those processes and demonstrate their value. It is possible to adopt agile software development processes in large and geographically distributed organizations. Adoption requires thoughtful and careful application, integration and refinement of the practices at the core of these agile processes for a successful outcome. REFERENCES Beck, K., Andres, C. (2005). Extreme Programming Explained. Second Edition. Copyright 2005, Pearson Education, Inc. Boehm, B., Turner, R. (2005). Management Challenges to Implementing Agile Processes in Traditional Development Organizations. IEEE Software. 0740-7459/05. Cusumano, M.A. (2008). Managing Software Development in Globally Distributed Teams. Communications of the ACM. February 2008/Vol. 51, No. 2. Gamma, E., Wiegand, J. (2005). Presentation: The Eclipse Way, Processes That Adapt. EclipseCon 2005. Copyright 2005 by International Business Machines. Leffingwell, D. (2007). Scaling Software Agility: Best Practices for Large Enterprises. Copyright 2007 by Pearson Education, Inc. 
Lindvall, M., Muthig, D., Dagnino, A., Wallin, C., Stupperich, M., Kiefer, D., May, J., Kahkonen, T. (2004). Agile Software Development in Large Organizations. IEEE Computer. 0018-9162/04. Manifesto. (2001). Manifesto for Agile Software Development. Retrieved 2 October 2008 from http://agilemanifesto.org/. Murthi, S. (2002). Scaling Agile Methods - Can Extreme Programming Work for Large Projects? www.newarchitectmag.com. October 2002. Ramesh, B., Cao, L., Mohan, K., Xu, P. (2006). Can Distributed Software Development Be Agile? Communications of the ACM. October 2006/Vol. 49, No. 10. ...

October 12, 2008 · 10 min · 2104 words · Jim Thario

Applicability of DoDAF in Documenting Business Enterprise Architectures

As of 2005, the Department of Defense employed over 3 million uniformed and civilian people and had a combined fiscal budget of $400 billion (Coffee, 2005). The war-fighting arm of the government has had enormous buying power since the Cold War, and the complexity of technologies used in military situations continues to increase. To make optimal use of the dollars it spends and to reduce rework and delays in the delivery of complex solutions, the DoD needed to standardize the way providers describe and document their systems. The DoD also needed to promote and enhance the reuse of existing, proven architectures for new solutions. The Department of Defense Architecture Framework (DoDAF) is used to document architectures of systems used within the branches of the Department of Defense. “The DoDAF provides the guidance and rules for developing, representing, and understanding architectures based on a common denominator across DoD, Joint, and multinational boundaries.” (DODAF1, 2007). DoDAF has roots in other enterprise architecture frameworks such as the Zachman Framework for Information Systems Architecture (Zachman, 1987) and Scott Bernard’s EA-cubed framework described in (Bernard, 2005). Zachman’s and Bernard’s architecture frameworks have been largely adopted by business organizations to document IT architectures and corporate information enterprises. Private sector businesses supplying solutions to the DoD must use the DoDAF to document the architectures of those systems. These suppliers may not be applying concepts of enterprise architecture to their own business, or they may be applying a different framework internally with an established history of use in the business IT sector. The rigor defined in DoDAF version 1.5 is intended for documenting war-fighting and business architectures within the Department of Defense. The comprehensive nature of DoDAF, including the required views, strategic guidance, and data exchange format, also makes it applicable to business environments. For those organizations in the private sector that must use the DoDAF to document their deliverables to the DoD, it makes sense to approach adoption of DoDAF in a holistic manner and extend the use of DoDAF into their own organization if they intend to adopt any enterprise architecture framework for this purpose. The Department of Defense Architecture Framework is the successor to C4ISR. “The Command, Control, Communications, Computers, and Intelligence, Surveillance, and Reconnaissance (C4ISR) Architecture Framework v1.0 was created in response to the passage of the Clinger-Cohen Act and addressed in the 1995 Deputy Secretary of Defense directive that a DoD-wide effort be undertaken to define and develop a better means and process for ensuring that C4ISR capabilities were interoperable and met the needs of the war fighter.” (DODAF1, 2007). In October 2003, DoDAF Version 1.0 was released and replaced the C4ISR framework. Version 1.5 of DoDAF was released in April of 2007. DoDAF solves several problems with the acquisition and ongoing operations of branches within the Department of Defense. Primarily, it serves to reduce misinterpretation in both directions of communication between system suppliers outside the DoD and consumers within the DoD. The DoDAF defines a common language, in the form of architectural views, for evaluating the same solution from multiple vendors. 
The framework is regularly refined through committee and supports the notion of top-down architecture that is driven from a conceptual viewpoint down to the technical implementation. Version 1.5 of DoDAF includes transitional improvements to support the DoD’s Net-Centric vision. “[Net-Centric Warfare] focuses on generating combat power from the effective linking or networking of the war fighting enterprise, and making essential information available to authenticated, authorized users when and where they need it.” (DODAF1, 2007). The Net-Centric Warfare initiative defines simple guidance within DoDAF 1.5 to support the vision of the initiative and guide qualities of the architecture under proposal. The guidance provided within DoDAF includes a shift toward a Service-Oriented Architecture, which we often read about in relation to the business sector. It also encourages architectures to accommodate unexpected but authorized users of the system. This is related to scaling the solution and loose coupling of the system components used in communication of data. Finally, the Net-Centric guidance encourages the use of open standards and protocols such as established vocabularies, taxonomies of data, and data interchange standards. These capabilities help promote integrating systems into larger, more information-intensive solutions. As this paper is written, Version 2.0 of DoDAF is being developed. There is currently no timeline defined for its release. DoDAF defines a layered set of views of a system architecture. The views progress from conceptual to technical. Additionally, a standards view containing process, technical, and quality requirements constrains the system being described. The topmost level is the All Views level. This view contains the AV-1 product description and the AV-2 integrated dictionary. AV-1 can be thought of as the executive summary of the system’s architecture. It is the strategic plan that defines the problem space and vision for the solution. The AV-2 is the project glossary. It is refined throughout the life of the system as terminology is enhanced or expanded. The next level is the Operational View. This level can be thought of as the business and data layer of the DoDAF framework. The artifacts captured within this view include process descriptions, data models, state transition diagrams of significant elements, and inter-component dependencies. Data interchange requirements and capabilities are defined within this view. Example artifacts from the operational view include the High-Level Operational Concept Graphic (OV-1), Operational Node Connectivity Description (OV-2), and Operational Activity Model (OV-5). The third level is the Systems and Services View. This view describes technical communications and data interchange capabilities. This level of the architecture is where network services (SOA) are documented. Physical technical aspects of the system are described in this level as well, including those components of the system that have a geographical requirement. Some artifacts from the Systems and Services View include the Systems/Services Interface Description (SV-1), Systems/Services Communications Description (SV-2), Systems/Services Data Exchange Matrix (SV-6), and Physical Schema (SV-11). DoDAF shares many of the beneficial qualities of other IT and enterprise architecture frameworks. A unique strength of DoDAF is the requirement of a glossary as a top-level artifact in describing the architecture of a system (RATL1, 2006). 
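The layered view structure described above can be summarized in a small data structure. The sketch below is illustrative only; the product identifiers are the ones named in this paper, but the dictionary layout and the coverage-checking helper are hypothetical, showing one way a tool might track which products of each view have been produced for a given architecture description.

```python
# Illustrative only: DoDAF 1.5 view levels mapped to a few of the products
# named in this paper. The helper function and its names are hypothetical.
DODAF_VIEWS = {
    "All Views": ["AV-1", "AV-2"],
    "Operational View": ["OV-1", "OV-2", "OV-5"],
    "Systems and Services View": ["SV-1", "SV-2", "SV-6", "SV-11"],
}

def missing_products(produced, required=DODAF_VIEWS):
    """Return, per view, the products that have not yet been produced."""
    return {view: [p for p in products if p not in produced]
            for view, products in required.items()}

if __name__ == "__main__":
    produced_so_far = {"AV-1", "AV-2", "OV-1", "SV-1"}
    for view, gaps in missing_products(produced_so_far).items():
        print(f"{view}: still needs {gaps}")
```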
Almost in tandem with trends in the business IT environment toward Service-Oriented Architectures, DoDAF 1.5 has shifted more focus to a data-centric approach and network presence through the Net-Centric Warfare initiative. This shift is motivated by the need to share operational information with internal and external participants who are actors in the system. It is also motivated by the desire to assemble and reuse larger systems-level components to build more complex war-fighting solutions. As with other frameworks, DoDAF’s primary strength is its prescription of a common set of views for comparing the capabilities of similar systems. The views enable objective comparisons between two different systems that intend to provide the same solution. The views enable faster understanding and integration of systems delivered from provider to consumer. The views also allow for cataloging and assembling potentially compatible systems into new solutions perhaps unforeseen by the original provider. The DoDAF views can reduce deployment costs and lower the possibility of reinventing an existing system due to lack of awareness of available solutions. A final unique strength of DoDAF is that it defines a format for data exchange between the repositories and tools used in manipulating the architectural artifacts. The product descriptions volume (DODAF2, 2007) defines, for each view, the data interchange requirements and the format to be used when exporting the data into the common format. This inclusion in the framework supports the other strengths, most importantly automation of discovery and reuse of existing architectures. Some weaknesses of DoDAF can be found when it is applied outside of its intended domain. Foremost, DoDAF was not designed as a holistic, all-encompassing enterprise architecture framework. DoDAF does not capture the business and technical architecture of the entire Department of Defense. Instead it captures the architectures of the systems (process and technical) that support the operations and strategy of the DoD. This means there may be yet another level of enterprise view that relates the many DoDAF-documented systems within the DoD into a unified view of participating components. This is not a permanent limitation of the DoDAF itself, but a choice of initial direction and maximum impact in the early stages of its maturity. The focus of DoDAF today is to document architectures of complex systems that participate in the overall wartime and business operations of the Department of Defense. A final weakness of DoDAF is the lack of business-financial artifacts such as a business plan, investment plan and return-on-investment plan. It is the author’s observation that the learning curve for Zachman is potentially smaller than for DoDAF. Zachman’s basic IS architecture framework method is captured in a single paper of less than 30 pages, while the DoDAF specification spans several volumes and exceeds 300 pages. Zachman’s concept of a two-dimensional grid with cells for specific subjects of documentation and models makes for an easier introduction to enterprise architecture. It has historically been developed and applied in business information technology situations. Zachman’s experience in sales and marketing at IBM motivated him to develop a standardized IS documentation method. There are more commonalities than differences in the artifacts used in both the DoDAF and Zachman methods. 
Zachman does not explicitly recommend a Concept of Operations Scenario, which is an abstract flow of events, a storyboard, or an artistic rendering of the problem space and desired outcome. This does not mean a CONOPS (Bernard, 2005) view could not be developed for a Zachman documentation effort. Business process modeling, use-case modeling, and state transition modeling are all part of the DoDAF, Zachman, and Bernard EA-cubed frameworks (Bernard, 2005). The EA-cubed framework developed by Scott A. Bernard was heavily influenced by Zachman’s Framework for Information Systems Architecture. Bernard scaled the grid idea to support enterprise architecture for multiple lines of business with more detail than was possible with a two-dimensional grid. The EA-cubed framework uses a grid similar to Zachman’s with an additional dimension of depth. The extra dimension allows each line of business within the enterprise to have its own two-dimensional grid to document its business and IT architecture. Cross-cutting through the cube allows architects to identify components potentially common to all lines of business - a way to optimize cost and reduce redundant business processes and IT systems. The EA-cubed framework includes business-oriented artifacts for the business plan, investment case, ROI, and product impact of architecture development. As mentioned above, DoDAF does not include many business-specific artifacts, specifically those dealing with financials. Both Zachman and EA-cubed have more layers and recommended artifacts than DoDAF. EA-cubed, for example, has specific artifacts for the physical network level and for security as a crosscutting component. The Systems and Services View of DoDAF recommends a Physical Schema artifact to capture this information if needed. In the case of DoDAF, vendors may not know in advance the physical communication medium that will be deployed with their system, such as satellite, microwave or wired networks. In these cases, the Net-Centric Warfare guidance within DoDAF encourages the support of open protocols and data representation standards. DoDAF is not a good starting point for beginners to enterprise architecture concepts. The bulk of the specification’s volumes can be intimidating to digest and understand without clear examples and case studies to reference. Searching for material on Zachman on the Internet produces volumes of information, case studies, extensions and tutorials on the topic. DoDAF was not designed as a business enterprise architecture framework. The forces driving its development include standardizing the documentation of systems proposed or acquired through vendors, enabling reuse of existing, proven architectures, and reducing the time to deploy systems-of-systems built from cataloged systems already available. Many of the documentation artifacts that Zachman and EA-cubed include in their frameworks are also prescribed in DoDAF, with different formal names but essentially the same semantics. The framework recommends more conceptual-level artifacts than Zachman. This could be attributed to the number of stakeholders involved in deciding whether a solution meets the need. DoDAF includes a requirement for a glossary and provides architectural guidance with each view based on current DoD strategy. Much of the guidance provided in DoDAF is directly applicable to the business world. The Net-Centric Warfare strategy, which is discussed within the guidance, is similar to the Service-Oriented Architecture shift happening now in the private sector. 
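The cube approach described above can be made concrete with a short sketch, and it also previews the idea, discussed below, of extending DoDAF with a similar third dimension. Everything in the sketch is hypothetical (the lines of business, grid levels and artifact names are invented for illustration); the point is only that adding a third index gives each line of business its own grid while still allowing crosscutting queries across the enterprise.

```python
# Hypothetical sketch of the "third dimension": artifacts indexed by
# (line_of_business, grid_level, artifact). All names are invented.
cube = {
    ("Retail Banking", "Business Processes", "Loan approval workflow"): "process model",
    ("Retail Banking", "Networks & Infrastructure", "Branch WAN"): "network diagram",
    ("Insurance", "Business Processes", "Claims intake workflow"): "process model",
    ("Insurance", "Networks & Infrastructure", "Branch WAN"): "network diagram",
}

def crosscut(cube, grid_level):
    """Slice across every line of business at one grid level, e.g. to spot
    components that could be shared enterprise-wide."""
    return {(lob, artifact): doc
            for (lob, level, artifact), doc in cube.items() if level == grid_level}

print(crosscut(cube, "Networks & Infrastructure"))
# Both lines of business document a "Branch WAN": a candidate for a shared component.
```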
The lack of business-strategic artifacts such as a business plan, investment plan, and ROI estimates would force an organization to supplement the prescribed DoDAF artifacts with several of its own or with artifacts from another framework. The Department of Defense Architecture Framework was designed to assist in the acquisition of systems from suppliers. In its current level of refinement for use with large enterprises, DoDAF has many similarities to Zachman’s framework. DoDAF could potentially benefit from an approach similar to Bernard’s, in which the flat tabular view is scaled up with depth. An extension of DoDAF with a third dimension could be used to document the architectures of multiple lines of business within an enterprise with more detail than is possible with a single artifact set. With minor enhancements, the DoDAF is a viable candidate for business enterprise architecture efforts. References Armour, F.J., Kaisler, S.H., Liu, S.Y. (1999). A Big-Picture Look at Enterprise Architectures. IT Professional, vol. 1, no. 1, pp. 35-42. Retrieved from http://doi.ieeecomputersociety.org/10.1109/6294.774792. Bernard, S.A. (2005). An introduction to enterprise architecture. (2nd ed.) Bloomington, IN: Author House. Coffee, P. (2005). Mastering DODAF will reap dividends. eWeek, 22(1), 38-39. Retrieved August 3, 2008, from Academic Search Premier database. Dizard, W. P. (2007). Taking a cue from Britain: Pentagon’s tweaked data architecture adds views covering acquisition, strategy. Government Computer News, 26, 11. p.14(1). Retrieved August 02, 2008, from Academic OneFile via Gale: http://find.galegroup.com.dml.regis.edu/itx/start.do?prodId=AONE. DoDAF1. (2007). DoD Architecture Framework Version 1.5. Volume I: Definitions and Guidelines. Retrieved 31 July 2008 from http://www.defenselink.mil/cio-nii/docs/DoDAF_Volume_I.pdf. DoDAF2. (2007). DoD Architecture Framework Version 1.5. Volume II: Product Descriptions. Retrieved 31 July 2008 from http://www.defenselink.mil/cio-nii/docs/DoDAF_Volume_II.pdf. IBM. (2006). An IBM Rational Approach to the Department of Defense Architecture Framework (DoDAF). Retrieved 2 August 2008 from ftp://ftp.software.ibm.com/software/rational/web/whitepapers/G507-1903-00_v5_LoRes.pdf. Leist, S., Zellner, G. (2006). Evaluation of current architecture frameworks. In Proceedings of the 2006 ACM Symposium on Applied Computing (Dijon, France, April 23 - 27, 2006). SAC ‘06. ACM, New York, NY, 1546-1553. DOI= http://doi.acm.org/10.1145/1141277.1141635. RATL1 (2006). An IBM Rational approach to the Department of Defense Architecture Framework (DoDAF) - Part 1: Operational view. Retrieved 1 August 2008 from http://www.ibm.com/developerworks/rational/library/mar06/widney/. RATL2 (2006). An IBM Rational approach to the Department of Defense Architecture Framework (DoDAF) - Part 2: Systems View. Retrieved 1 August 2008 from http://www.ibm.com/developerworks/rational/library/apr06/widney/. Zachman, J.A. (1987). A framework for information systems architecture. IBM Systems Journal, Vol. 26, No. 3, 1987. Retrieved July 2008 from http://www.research.ibm.com/journal/sj/263/ibmsj2603E.pdf. ...

August 9, 2008 · 12 min · 2423 words · Jim Thario

Issues of Data Privacy in Overseas Outsourcing Arrangements

Outsourcing is a business concept that has been receiving much attention in the new millennium. According to Dictionary.com (2008), the term outsourcing means to obtain goods or services from an outside source. The practice of outsourcing a portion of a business’ work or material needs to an outside provider or subcontractor has been occurring for a long time. The information technology industry and outsourcing have been the focus of editorials and commentaries regarding the movement of technical jobs from the United States to overseas providers. The globalization of business through expanding voice and data communication has forged new international partnerships and has increased the amount of outsourcing happening today. Businesses in the U.S. and Europe spend billions in outsourcing agreements with overseas service providers. According to Sharma (2008), spending on outsourcing in the European Union is almost £150 billion in 2008. The overriding goal in outsourcing work to a local or overseas provider is to reduce the operations cost for a particular part of the business. Many countries, such as India and China, have lower wages, and businesses in the U.S. and Europe can save money by hiring an overseas contractor to perform a portion of their work. Outsourcing is gaining popularity in the information age by assisting information technology companies in performing some of their business tasks. This can include data processing and call routing and handling. With the growth of the technology industry also comes the problem of maintaining and protecting private information about individuals, such as medical history or financial data. Many jurisdictions, such as the United States and the European Union, have mandatory personal data privacy laws. These laws do not automatically have counterparts in the country where the outsourcing service provider, or potentially the service provider’s subcontractors, are located. This paper discusses the issues of outsourcing work to an overseas provider when personal data is involved in the outsourced tasks. It presents several solutions to help manage the risk of data breaches caused by disparate laws in countries currently popular for information technology outsourcing. The most common types of work outsourced to overseas service providers include bulk data processing, call center handling, and paralegal services. The last of these can include work such as legal research, contract and brief writing, and transcription. Outsourcing firms typically do not have a U.S. law license, which limits the extent of their involvement in legal work. The United States is expanding national information protection laws. Two of the most prominent laws are the Health Insurance Portability and Accountability Act (HIPAA) and the Gramm-Leach-Bliley Act (GLB). The U.S. Congress enacted HIPAA in 1996. It relates to the protection of health information that can be used to identify someone or disclose a medical condition. “The data privacy and security requirements of HIPAA apply to health plans, health care providers, and health care clearinghouses. Among its many requirements, HIPAA mandates the creation and distribution of privacy policies that explain how all individually identifiable health information is collected, used, and shared.” (Klosek, 2005). The U.S. Congress enacted the GLB Act in 1999. 
The Financial Privacy Rule of the Act relates to documenting and auditing the processes an organization uses to assure the privacy of information that can identify persons, as in HIPAA, as well as private data about their finances. Both HIPAA and GLB require the organization to publish its information privacy policy and notify consumers each time it changes. “[…] The GLB Act focuses upon privacy of the non-public information of individuals who are customers of financial institutions.” (Klosek, 2005). The U.S. is not considered to be at the forefront of privacy protection law, and many countries have no privacy protection laws for their citizens at all. The European Union is one of the strictest regions with respect to data privacy and to outsourcing work that handles private information. The privacy directive for the entire EU took effect in 1998. It specifies a minimum standard for all member countries to follow in handling private personal data and transferring it between companies inside and outside of the European Union. “The EU privacy directive 1998 aims to protect the privacy of citizens when their personal data is being processed. […] One of the provisions of this directive […] addresses the transfer of personal data to any country outside of the EU.” (Balaji, 2004). In most cases, a European company transferring personal data to an overseas outsourcing provider would need to assure that the contractor follows the EU rules for handling and processing the data. The EU is also in the process of pre-certifying certain countries for properly handling personal data according to the directive’s standards. Businesses in the Philippines have been providing outsourcing solutions for information technology businesses for over a decade. Estavillo (2006) states that the government has increased its focus on keeping the outsourcing landscape fertile in the Philippines. It has created an optional certification program for local businesses based on the government’s own guidelines for the protection of information used in data processing and communications systems. The government hopes to continue to expand its reach into enforcing data protection by penalizing unlawful activities such as data breaches and unauthorized access to data-intensive systems. Recently, ISO started an international certification effort called ISO 27001. The purpose of the certification is to prove that a company documents and follows information security practices and controls. Ely (2008) points out that an ISO 27001 audit is performed against processes of the outsourcing provider’s own choosing, so the client must make sure the outsourcing firm follows the industry’s best practices and the compliance guidelines of the client’s home country, and that it understands them deeply. Often an overseas company will adopt HIPAA or Payment Card Industry (PCI) standards for the handling of personal data and be certified against that standard for ISO 27001. Any size company can be certified under this standard, and there are no international restrictions regarding who may be certified. Outsourcing work in the information technology industry almost always includes the access or transfer of data between the client organization and the outsourcing provider. Voice conversations and movement of data over an international connection can be subject to interception and monitoring by U.S. and foreign surveillance programs. 
Ramstack (2008) finds that “[…] paralegal firms in India are doing a booming business handling the routine legal work of American law firms, such as drafting contracts, writing patents, indexing documents or researching laws.” A lawsuit filed in May of 2008 requests a hold on new legal outsourcing work until outsourcing companies can provide assurances that data transferred overseas can be protected against interception by U.S. and foreign intelligence collection agencies. The fear is that private legal information about citizens could be transferred from intelligence agencies to law enforcement agencies in the same or allied countries. The mix of international standards and laws offers little hope of legal action across borders when personal data is misused or illegally accessed. The flood of competition among overseas outsourcing companies does offer some hope, because reputation is extremely important in winning sensitive outsourcing agreements. Once an outsourcing provider has been tainted with a bad reference for bulk data processing of foreign citizens’ medical information, for example, the damage will limit the firm’s financial upside until its reputation can be rebuilt. Not all of the focus should be on the outsourcing provider, however. It is important for an organization to define and understand its own processes involving data privacy internally before beginning an outsourcing agreement. People within the business who work around and regularly handle private data should be included early in the process of defining the requirements for outsourcing information-related work. These contributors can include the IT and business controls staff members and staff supporting the efforts of the CIO’s office. A cross-company team should define the conditions needed to work with private data regardless of the outsourcing group - local or overseas. They can also help define constraints placed on the outsourcing service provider. “Ensure that the contractual arrangement covers security and privacy obligations. Include language in the contract to articulate your expectations and stringent penalties for violations. Review your provider’s organizational policies and awareness training for its employees.” (Balaji, 2004). Large outsourcing providers may choose to outsource their work to smaller companies in their local country. It is important to be able to control the primary outsourcing company’s ability to subcontract work to other providers, or to require that the data handling standards in the contract extend to all subcontractors who become involved, at the risk of the original outsourcing provider. In this case it is also important to have the outsourcing service provider identify in advance all or most of the subcontractors involved, so that references can be obtained. It is also important to define in the outsourcing contract what happens when the relationship terminates. The transition plan for the end of the outsourcing agreement must include a process for obtaining control of data transferred to the outsourcing provider from the customer organization. There should be a way to return the data to the customer organization or assure its destruction on the outsourcing provider’s information systems. Although it has been a part of business for as long as there has been business, outsourcing in the information age brings with it new risks as well as opportunities for business cost optimization and scaling. 
Risks in outsourcing information services that involve private data can be partially mitigated through a detailed contract and through outsourcing vendor transparency. The best way to ensure compliance with contractual terms is for the customer organization to understand its own data privacy standards and to hold all outsourcing arrangements to the same requirements it follows internally. The customer organization should perform or obtain third-party audit reports of the outsourcing provider’s processes and systems for ongoing reassurance of proper handling of private data. References Balaji, S. (2004). Plan for data protection rules when moving IT work offshore. Computer Weekly. 30 November 2004, Pg. 26. Ely, A. (2008). Shore Up Data Sent Offshore. InformationWeek, Tech Tracker. 2 June 2008, Pg. 37. Estavillo, M. E., Alave, K. L. (2006). Trade department prods outsourcing services to improve data security. BusinessWorld. 9 August 2006, Pg. S1/1. Klosek, J. (2005). Data privacy and security are a significant part of the outsourcing equation. Intellectual Property & Technology Law Journal. June 2005, 17.6, Pg. 15. Outsourcing. (n.d.). Dictionary.com Unabridged. Retrieved June 23, 2008, from Dictionary.com website: http://dictionary.reference.com/browse/outsourcing. Ramstack, T. (2008). Legal outsourcing suit spotlights surveillance fears. The Washington Times. 31 May 2008, Pg. 1, A01. Sharma, A. (2008). Mind your own business. Accountancy Age. 14 February 2008, Pg. 18. ...

June 28, 2008 · 9 min · 1754 words · Jim Thario

Research Essay on Signaling System 7

This research paper describes a telecommunications standard called Signaling System 7 (SS7). This technology defines a signaling system for the control and routing of voice calls between telephone switches and switching locations. SS7 uses out-of-band signaling to place and control calls. It replaces an older system of in-band signaling used to control telephone equipment. In-band signaling means the audio channel is used as a control channel for telephone switches. Operators would use tones over the audio channel to connect switches and open paths to the call destination. The use of out-of-band signaling means that control of creating an audio path through telephone switches is performed through a separate data channel that connects the switches together. The caller does not have access to this signaling channel, as they do with in-band signaling. SS7 can also carry data to switching locations about the calls they route. This data can include information for purposes of billing network time back to the call’s originating network and the caller’s account. “Signaling System 7 (SS7) is a set of telephony signaling protocols that are used to set up and route a majority of the world’s land line and mobile public switched telephone network (PSTN) telephone calls.” (Ulasien, 2007). SS7 provides more efficiency and reliability for call handling than in-band signaling. SS7-controlled calls can verify that the audio path for a call is ready to initiate, for example, and not create the audio path until the call is answered at the other end. Another example: if the destination phone number is busy, no audio path needs to be established, and the switch directly connected to the caller can generate the busy signal itself. The strategy of delaying the creation of the audio path until the last moment prevents wasted bandwidth within the switching infrastructure. This scenario would not be possible with in-band signaling, since in-band signaling depends on having an audio path established before anyone answers the other end of the call. SS7 allows the creation of innovative customer features and the use of rules-based capabilities for call routing that were previously impossible with in-band signaling technology. Signaling System 7 began development in the 1970s and saw wide deployment beginning in the early 1990s. The technology research and development was sponsored by AT&T and originally named the Common Channel Signaling System (CCSS). AT&T proposed it to the International Telecommunication Union as a standard beginning in 1975. SS7 was issued as a standard in 1980 and has been refined three times since. The ITU Telecommunications Standardization Sector (ITU-TS) develops global SS7 standards. The ITU allows different countries or organizations to make their own refinements and extensions to the global SS7 standard. The American National Standards Institute (ANSI) and Bellcore define a regional SS7 standard for North America and the Regional Bell Operating Companies (RBOCs). Before the adoption of Signaling System 7, the only path between telephone switches was the audio channel. Telephone operators would use in-band signaling to set up long distance calls, or route international calls over cable or satellite using touch-tones. Maintenance crews would put telephone switches into special modes using sequences of tones to turn off accounting or allow operations a normal user would not be able to perform. In-band signaling is not used only to control telephone switches. 
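Before looking at in-band signaling more closely, here is a toy simulation that makes the bandwidth-saving argument above concrete. It is not the real SS7/ISUP message flow or any actual telephony API; it is only a sketch under the simplifying assumption that a switch commits an audio trunk only when out-of-band signaling reports that the far end has answered.

```python
# Toy model of out-of-band call setup: signaling happens on a separate data
# channel, and an audio trunk is reserved only when the far end answers.
# Not real SS7/ISUP; the class and its behavior are invented for illustration.
class OriginatingSwitch:
    def __init__(self, busy_numbers=()):
        self.busy_numbers = set(busy_numbers)
        self.trunks_in_use = 0

    def place_call(self, dialed_number):
        # 1. Exchange signaling messages first (no audio bandwidth used yet).
        if dialed_number in self.busy_numbers:
            # 2a. Far end reports busy: the busy tone is generated locally.
            return "busy - no trunk reserved"
        # 2b. Far end answers: only now is an audio path committed.
        self.trunks_in_use += 1
        return "connected - trunk reserved"

switch = OriginatingSwitch(busy_numbers={"555-0100"})
print(switch.place_call("555-0100"))   # busy - no trunk reserved
print(switch.place_call("555-0199"))   # connected - trunk reserved
print("trunks in use:", switch.trunks_in_use)  # 1
```

With in-band signaling, by contrast, the audio path would have to exist before either outcome could even be signaled, which is exactly the waste the out-of-band approach avoids.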
We encounter in-band signaling often through the use of telephone-based services from vendors. Call routing through most of today’s large corporate phone systems requires extensive use of the touch-tone keypad. Most voicemail systems require us to enter our personal identification numbers using tones to access messages. Your bank might provide a system to check your balances or transfer money through a phone-based system that uses touch-tones to enter your account information and direct your choices. In-band signaling works well for low-bandwidth situations, such as entering an account code or choosing a menu item. Routing instructions to telephone switches can result in a complex series of tones representing access codes and phone numbers. Although it is useful for vendors providing self-service capabilities to customers, in-band signaling in mission-critical systems, such as unprotected telephone switching networks, has been exploited. Exposure of the signaling channel meant that callers would sometimes discover and record the in-band signaling tones used to route calls and control switches. Sometimes the audio signals were discovered completely by accident. During the 1970s and 1980s, people such as John Draper (Captain Crunch) were known for their little home-built boxes that could connect to telephone jacks and send sequences of tones to obtain free long distance calls. These were known as black boxes or blue boxes. A whistle that came as a prize in a box of cereal inspired John Draper’s blue box creation. “The box blasted a 2600-Hz tone after a call had been placed. That emulated the signal the line recognized to mean that it was idle, so it would then wait for routing instructions. The phreaker would put a key pulse (KP) and a start (ST) tone on either end of the number being called; this compromised the routing instructions, and the call could be routed and billed as a toll-free call. Being able to access the special line was the basic equivalent to having root access into Bell Telephone.” (Cross, 2007). Signaling System 7 moves the signaling channel out of the audio channel, where it is no longer accessible to the parties participating in the call. SS7 specifies that telephone switches connect together using a dedicated digital network used only for signaling and managing calls. The signaling network among switches is similar to a traditional computer network. The signaling network can be designed for redundancy and does not need to take the same physical path as the voice data paths. In addition to relocating the signaling channel, the protocol allows for the creation of new and innovative features related to how calls are controlled and routed through the network. The Intelligent Network is a telecommunications industry term, described by Zeichick (1998) as having more reliance on digital technologies, more contextual information about calls in addition to the voice data, and more control provided to the end user over how their telephone experience works. Caller ID works, for example, because the originating caller information is passed from switch to switch through the signaling channels. As mobile phone callers move around, the SS7 signaling protocol helps switches find the proper route for calls to that person’s phone. The destination switch for a mobile phone moving in a train or automobile can change quickly. Call routing between switches is optimized with SS7’s definition of shared databases that are accessed through the signaling network. 
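A minimal sketch of that shared-database idea follows. The data and function are hypothetical (real lookups use SS7's own query protocols and database types, which are not shown here); the sketch only illustrates a switch acting as a client that asks a central database how to route a dialed number.

```python
# Hypothetical sketch of a switch querying a shared routing database over the
# signaling network. The records and field names are invented for illustration.
ROUTING_DB = {
    "555-0150": {"owner": "Carrier A", "next_hop": "switch-17"},
    "555-0175": {"owner": "Carrier B", "next_hop": "switch-42"},
}

def route_query(dialed_number):
    """Return routing instructions as a switch (client) would receive them
    from the shared database (server)."""
    record = ROUTING_DB.get(dialed_number)
    if record is None:
        return {"action": "reject", "reason": "number not found"}
    return {"action": "route", "next_hop": record["next_hop"], "owner": record["owner"]}

print(route_query("555-0150"))
# {'action': 'route', 'next_hop': 'switch-17', 'owner': 'Carrier A'}
```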
The databases contain rules about how calls should be routed to their destination. Switches on an SS7 network can query shared databases to find out which provider owns a phone number and how to route the call to that number. The databases can also contain feature-specific information. This aspect of the SS7 implementation has been characterized as client-server, meaning the switches act as clients to the shared databases, which hold rules and other information for managing calls. “SS7 links the telephone system with a client-server computer architecture to create a distributed, efficient and easily modified telephone infrastructure. The computers use information from common databases to control call switching and to allow the transfer of messages within the system.” (Krasner, 1997). New technologies are testing the longevity of the Signaling System 7 protocol. Packet-switched voice over IP is causing some disruption in the SS7 space. However, there is more emphasis on integration and signaling gateways than on replacement of existing SS7 infrastructure with something more recent. Session Initiation Protocol (SIP) is a signaling protocol for controlling audio and video connections over Internet Protocol networks. It can be implemented in hardware or software. SIP can be used for voice, video conferencing, instant messaging, and other types of streaming multimedia. H.323 is another streaming multimedia signaling protocol used for audio and video over Internet packet networks. Microsoft’s NetMeeting application uses H.323 as its protocol to connect NetMeeting nodes together in a wide-area conference. H.323 is also a recommendation of the ITU-TS. The business value of SS7 is that it provides opportunities for security, efficiency and optimization of call routing, and it provides the foundation to build innovative call handling features using contextual information about calls and shared databases. It is a standards-based protocol and has been used by the world’s established telecommunications providers for over a decade. The protocol defines the means by which telephone switches exchange call routing and feature information - it does not assume voice data is carried on any particular medium as calls are transferred through the system. This simple abstraction allows SS7 to work with new technologies as they arrive in the mainstream. It is possible for SS7 to work within a mixed-technology environment including circuit-switched and packet-switched data networks. Ulasien (2007) says that the extensibility of SS7 allows the incremental migration of an organization from circuit-switched to packet-switched calls. The voice network is turning into the streaming media network, and SS7 will continue to be tested in its role of connection maker and gateway to more recent communication technologies such as VOIP and video conferencing. References Cross, Michael. (2007). Developer’s Guide to Web Application Security. Syngress Publishing 2007. ISBN: 9781597490610. Hewett, Jeff. (1996). Signaling System 7: the mystery of instant worldwide telephony is exposed. Electronics Now. 67.n4 (April 1996): 29(7). Krasner, J. L., Hughes, P. & Klapfish, M. (1997). SS7 in transition. Telephony. 233.n14 (October 6, 1997): 54(4). Ulasien, Paul. (2007). Signaling System 7 (SS7) Market Trends. Faulkner Information Services. Document 00011475. July 2007. Zeichick, Alan. (1998). Lesson 125: Signaling System 7. Network Magazine. December 1, 1998: NA. ...

May 30, 2008 · 8 min · 1612 words · Jim Thario

Concepts and Value of the "4+1" View Model of Software Architecture

This essay describes the concepts and value of the “4+1” View Model of Software Architecture described by Philippe Kruchten in 1995. The purpose of the 4+1 view model is to provide a means to capture the specification of software architecture in a model of diagrams, organized into views. Each view represents a different concern, and the diagrams within each view use a diagramming notation suitable for that diagram’s purpose. Each view answers questions related to the structure, packaging, concurrency, distribution, and behavior of the software system. The “+1” is a view of the scenarios and behavior of the software being described. This view drives development of the other views. The value the 4+1 view model approach brings to software architecture is that it is not specific to any class of software system. The principles behind the 4+1 view model can be applied to any scale of software system, from embedded software to web applications distributed across many collaborating servers. The software architecture of business IT systems can be represented using the 4+1 view model. What is a model? “A model plays the analogous role in software development that blueprints and other plans (site maps, elevations, physical models) play in the building of a skyscraper.” (OMG, 2005) Software can be specified using just textual requirements, or it can be shown as a model of collections of diagrams with textual notes describing specific details. Models provide a filter for humans to deal with a lot of information at one time. Models give us a big picture, just like a blueprint does. Diagrams within a model can be organized by subject, purpose, or locality within a system. For building construction, a single page in a roll of blueprints might describe the routing plan for plumbing or electrical conduits. A different page might detail the foundation. Likewise, a diagram within a model might show us the structure of the database. A different diagram will show where each piece of the software runs on a network. The content of diagrams in models can be at any level of “zoom” to describe parts of the software. Simple data structures can be described in a diagram, as can complex scenarios carried out by several servers in synchronization. Kruchten’s purpose in the 4+1 view model is to capture and document the software’s architecture using diagrams organized in several views. What is software architecture? “Software architecture is the principled study of the overall structure of software systems, especially the relationship among subsystems and components.” (Shaw, 2001) I interpret the word “relationship” in this context to mean many possible kinds of relationships. One kind of relationship between subsystems is where one subsystem relies on the services of another subsystem. There can be a behavioral relationship among subsystems, where the protocol of messages between them must be documented. Another type of relationship among subsystems is collocation - how do they communicate? Can they communicate? What is the mechanism used to store transaction data, and are the interfaces and support code packaged within each subsystem to allow data storage to happen? These are all questions answered by information at the level of software architecture. “Software architecture is concerned with the high-level structures of a software system, the relationships among them, and their properties of interest. 
These high-level structures represent the loci of computation, communication, and implementation.” (Garlan, 2006) A driving force behind the 4+1 view model is that a single diagram cannot communicate information about all the different kinds of relationships within a software system. A diagram that showed all the different concerns of a software’s architecture simultaneously would be overwhelming. Each view in the 4+1 view model has a different concern or subject. Multiple diagrams can exist within each view, like files exist within a folder organized by subject. Modeling and diagramming tools are used to create diagrams and organize them when applying the 4+1 view model. Many tools exist to build diagrams, including Microsoft Visio (VISIO, 2008), Enterprise Architect (EA, 2008) and Rational Software Architect (RSA, 2008). Kruchten uses the Booch notation in his paper to capture information in the diagrams for each view. Since Kruchten wrote his paper over ten years ago, the Booch notation has been refined and was contributed into the Unified Modeling Language specification from the Object Management Group. The 4+1 views are the logical view, process view, development view and physical view. The “+1” view contains the scenarios that represent the system’s interaction with the outside world. The scenarios are requirements. They drive the development of the other views of the architecture. The logical view contains the decomposition of the system into functions, structures, classes, data, components and layers. Kruchten points out that several different types of diagrams might be necessary within the logical view to represent code, data, or other types of decomposition of the requirements. The scenarios, or “+1” view, mainly influence the development of this view. The logical view is needed by the development and process views. The process view is concerned with the actual running processes in the deployed system. Processes are connected to each other through communication channels, like remote procedure calls or socket connections. Elements within the logical view run on processes, so there is traceability from the process view back to the logical view. Some projects, like the development of a code editor, will not require a process view since there is only one process involved. The third view is the development view. The scenarios and the elements in the logical view drive the contents of the development view. The development view documents the relationships and packaging of the elements from the logical view into components, subsystems and libraries. Diagrams within a development view might show which classes or functions are packaged into a single archive for installation. The diagrams within the development view should allow someone to trace back from a package of code to elements in the logical view. Dependencies among packages of code are documented in this view as well. The fourth view is the physical view, and it is created from the scenarios, process view and development view. The fourth view shows the allocation of packages of code and data, and of processes, to processing nodes, e.g. computers. The relationship between nodes is also shown in this view, usually in the form of physical networks or other physical data channels that allow processes on different nodes to communicate. The final “+1” view is the scenarios, which represent requirements for the behavior of the system. Kruchten’s paper shows examples using object scenario and object interaction diagrams. 
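As a rough illustration of the traceability the views provide, the sketch below records which logical elements run in which processes, which packages contain them, and which nodes host them. The element, package, process and node names are invented for illustration; they are not from Kruchten's paper.

```python
# Illustrative sketch: a tiny architecture description organized by the
# 4+1 views, with traceability links between them. All names are invented.
model = {
    "logical":     {"OrderService", "OrderRepository", "OrderSchema"},
    "process":     {"app-server-process": {"OrderService", "OrderRepository"},
                    "database-process":   {"OrderSchema"}},
    "development": {"orders.jar": {"OrderService", "OrderRepository"},
                    "orders.sql": {"OrderSchema"}},
    "physical":    {"web-node": {"app-server-process"},
                    "db-node":  {"database-process"}},
    "scenarios":   ["Customer places an order", "Customer checks order status"],
}

def node_for(element):
    """Trace a logical element through the process view to the node it runs on."""
    for process, elements in model["process"].items():
        if element in elements:
            for node, processes in model["physical"].items():
                if process in processes:
                    return node
    return None

print(node_for("OrderSchema"))   # db-node
print(node_for("OrderService"))  # web-node
```

Tracing OrderSchema to db-node walks from the logical view through the process view to the physical view, which is the same path a reader of the model would follow by hand.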
For capturing the scenarios of the software system, one could also use classic flow charts, use cases or UML activity diagrams. At a minimum, the scenarios should document how the system behaves and interacts with the outside world, either with people or with other systems. The information captured within a “4+1” View Model of Software Architecture is common to all software systems and can be applied as a general approach to document and communicate about information systems. Business information systems are very often database-centric, and use fat-client or web-based interfaces to enter, search, update and remove data. A business system can enforce a workflow of approvals before it allows a transaction to complete. Data warehousing solutions exist to archive, profile, and find new patterns in data. Many businesses are deploying self-service web sites for customers to interact with their business without constraining the customer to specific times a transaction can take place. Each of these qualities of business systems can be captured with one or more views of the “4+1” model. A logical view can be used to document the database schema, code modules, and even individual pages of content within a web solution. The development view for a J2EE solution would document how HTML files, JSP files, and Java code are packaged into archive files before deployment to the application server. The process view for a client-server database system would show code modules assigned to the user’s application process. The database schema and stored procedures would be assigned to the relational database server processes. Finally, a physical view of a web-based database application would show separate servers for the web and database. The web server process from the process view would be assigned to the web server node, as would the packages of HTML, CGI and other code in the development view. The physical view would also show a similar traceability for the database server node. The value of the “4+1” View Model of Software Architecture is that it serves as a set of general guiding principles answering the question of what, at a minimum, needs to be documented when describing software architecture. Each view within the model has a well-defined subject or concern for the diagrams that are organized within the view. All software can be described in terms of behavior, structure, packaging and where it executes. These are the basic qualities the 4+1 view model intends to document for easier human consumption. There are no official constraints on the notation styles that can be used by diagrams in each view. When applied to larger systems, the logical view will contain many types of diagrams. The notation independence makes it a very flexible approach to use for many styles of software. When it is taught to a team along with diagramming skills, it can be used as a significant form of communication and provide clarity among software project team members when creating new IT projects or documenting legacy ones. References Garlan, D., Schmerl, B. (2006). Architecture-driven Modeling and Analysis. 11th Australian Workshop on Safety Related Programmable Systems (SCS ’06). Kruchten, P. (1995). Architectural Blueprints - The “4+1” View Model of Software Architecture. IEEE Software 12 (6), 42-50. Object Management Group. (2005). Introduction to OMG UML. Retrieved May 10, 2008 from http://www.omg.org/gettingstarted/what_is_uml.htm. Rational Software Architect product page. (2008). 
Retrieved May 10, 2008 from http://www-306.ibm.com/software/awdtools/architect/swarchitect. Shaw, M. (2001). The Coming-of-Age of Software Architecture Research. IEEE. 0-7695-1050-7/01. Sparx Systems home page. (2008). Retrieved May 10, 2008 from http://www.sparxsystems.com.au. ...

May 30, 2008 · 8 min · 1665 words · Jim Thario

m0n0wall traffic shaping

In this article I will discuss my configuration for traffic shaping using m0n0wall. My goals for traffic shaping are giving priority to VOIP traffic leaving my network and limiting the combined incoming traffic speed destined for my servers. Some of my assumptions are that you know how to configure your LAN and WAN networks in m0n0wall, you have NAT configured for your outbound LAN network traffic, and you are using the DHCP server for your LAN. The following image shows my LAN network configuration. [m0n0wall screenshot] The DHCP server for my LAN network is configured to offer addresses from 192.168.85.100-192.168.85.199. I can't ever imagine having more than 100 clients on my network. I use the addresses below .100 for static assignments on my LAN. My three servers are configured for static addresses on the LAN - they do not use DHCP. In addition to the three servers, the wireless access points are configured for static LAN addresses and the VOIP telephone adapter uses a fixed DHCP LAN address. I use inbound NAT for my Internet services to redirect HTTP, HTTPS and SMTP from the public firewall IP address to the desired server on the LAN. The following image shows the inbound NAT configuration. You will see HTTP and HTTPS are redirected to one server and SMTP is redirected to another server. In addition to these rules, m0n0wall will add rules to the firewall to allow this traffic to pass. [m0n0wall screenshot] The VOIP telephone adapter uses DHCP by default, and I wanted to maintain the provider's default configuration for the device. My strategy was to determine the network MAC address of the VOIP device and set the m0n0wall DHCP server to always offer the device the same LAN IP address. The following image shows the settings in the m0n0wall DHCP server for the VOIP adapter. [m0n0wall screenshot] With this configuration in place, I can now create rules in the traffic shaper to manage inbound and outbound traffic speed based on LAN IP address. The first task is to define the pipes that will control inbound and outbound traffic. I have two pipes defined - one for all outbound traffic and one for inbound server traffic. I was able to verify my outbound Internet speed at about 1.5 Mbit. I subtracted about 6% from that and came up with 1434 Kbit. I talk about why you should do this in a previous article. The basic idea is that you want packets to queue only in your m0n0wall, and to prevent them from queuing in your ISP router or any other device before they leave your location. The only way to be sure is to throttle down your outbound speed by a few percent. Your connection may need more or less, and you should experiment and re-test your settings once or twice a year. The second pipe is used to limit the maximum speed of incoming data to the servers. I want to limit the combined inbound traffic to all three of the servers to about 1 Mbit. The traffic that would pass through this pipe includes incoming mail delivery and incoming requests to the web server. This pipe will not impact web server responses, i.e. page content returned. Mail delivery between servers on the Internet happens asynchronously, so the client workstations will not care if a message delivery takes 1 second or 15 seconds to occur. Client workstations are interacting with servers on the local network, so they will not feel any of the shaping. [m0n0wall screenshot] The strategy for outbound traffic is to give top priority to VOIP, second priority to workstations and last priority to outbound server traffic. 
The strategy for outbound traffic is to give top priority to VOIP, second priority to workstations, and last priority to outbound server traffic. To accomplish this I need three queues in the m0n0wall traffic shaper section, one for each of the three outbound priorities. The first queue is for VOIP and has a weight of 50. The second queue is for workstation traffic and has a weight of 40. The last queue is for outbound server traffic and has a weight of 10. The weights of the three queues add up to 100, but the values are purely relative. All three queues are connected to the outbound 1434 Kbit pipe. If there is no outbound VOIP or workstation traffic, the server queue with the weight of 10 will get the entire 1434 Kbit outbound pipe. See the following image for the queues.

[screenshot: m0n0wall traffic shaper queues]

In reality, VOIP traffic only takes about 100 Kbit of outbound bandwidth when in use. Even though the high priority queue has a weight of 50, VOIP will never actually use 50% of the 1434 Kbit outbound pipe; the weight simply guarantees that the VOIP service gets all the outbound bandwidth it needs.

The final piece of the traffic shaping strategy is the rules that place outbound packets into a specific queue, or place inbound server traffic into the server pipe. Inbound VOIP and workstation traffic is not shaped. The rules I use are based on the interface the traffic is leaving. Traffic leaving the WAN interface is traffic sent out to the Internet; traffic leaving the LAN interface is traffic received from the Internet. With that, see the following image.

[screenshot: m0n0wall traffic shaper rules]

The first five rules are for outbound traffic destined for the Internet. Rule 1 places outbound VOIP traffic in the queue with weight 50. Rules 2-4 place outbound server traffic in the queue with weight 10. Rule 5 is a catch-all that places all other outbound traffic in the medium priority queue with weight 40. Rules 6-8 are for traffic leaving the LAN interface, in other words, inbound traffic from the Internet. These rules place traffic destined for my three servers into the 1 Mbit inbound pipe, constraining their combined inbound traffic to 1 Mbit. Only the inbound server traffic is shaped. With these pipes, queues and rules, I've accomplished my goal: VOIP traffic leaves first, workstation traffic leaves second, server traffic leaves last, and inbound server traffic is limited to 1 Mbit.

How can I tell if these rules are working? m0n0wall has a status.php page where you can see the byte and packet counts for these rules. To see the statistics, sign in to your m0n0wall web console and append status.php to the address in your browser. The page is a textual dump of various internal statistics; the section to review is the ipfw show listing. The following image shows the statistics for my traffic shaper rules.

[screenshot: ipfw show output from the m0n0wall status.php page]

In this image you can see the queue and pipe rules with their packet and byte counts. Take note of the out via dc0 and out via dc1 parts of the rules; dc0 and dc1 are my WAN and LAN network adapters. The first two rules and the very last rule are added automatically by the m0n0wall software. You can see the queue 1 rule for high priority outbound VOIP traffic, matched by a specific LAN source address. The next three rules, for queue 3, are low priority outbound server traffic, again matched by LAN address. The queue 2 rule is the catch-all for outbound workstation traffic at medium priority. The next three rules send inbound server traffic to the 1 Mbit pipe. All other inbound traffic is not shaped and matches the last rule. ...
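For reference, the queues and classification rules described above map roughly onto ipfw/dummynet commands like the following sketch. The dc0 (WAN) and dc1 (LAN) interface names come from the ipfw show output mentioned above; the LAN addresses and rule ordering here are placeholders, not my actual assignments:

  # Three queues sharing the 1434 Kbit outbound pipe; weights are relative
  ipfw queue 1 config pipe 1 weight 50   # VOIP
  ipfw queue 2 config pipe 1 weight 40   # workstations (catch-all)
  ipfw queue 3 config pipe 1 weight 10   # outbound server traffic

  # Outbound classification: traffic leaving the WAN interface
  ipfw add queue 1 ip from 192.168.85.150 to any out via dc0   # VOIP adapter (placeholder address)
  ipfw add queue 3 ip from 192.168.85.10 to any out via dc0    # server 1 (placeholder address)
  ipfw add queue 3 ip from 192.168.85.11 to any out via dc0    # server 2
  ipfw add queue 3 ip from 192.168.85.12 to any out via dc0    # server 3
  ipfw add queue 2 ip from any to any out via dc0              # everything else

  # Inbound server traffic: traffic leaving the LAN interface toward the servers
  ipfw add pipe 2 ip from any to 192.168.85.10 out via dc1
  ipfw add pipe 2 ip from any to 192.168.85.11 out via dc1
  ipfw add pipe 2 ip from any to 192.168.85.12 out via dc1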

March 4, 2008 · 6 min · 1231 words · Jim Thario