Global Horizons for Big Data

Towards Globally-Distributed Data

1 Global Horizons for Big Data
Big Data increasingly drives major business decisions [14]. It is now common
for a medium-sized company to process terabytes of data a day, and for the
resulting analytics to support services and management decisions. This trend drives the
analytic database market, which is a major growth area in the multi-billion dollar
database market.
To tackle Big Data we must adopt scalable systems. Parallel databases are limited
in their scalability, since they cope poorly with node failures, which are common
at scales of more than a dozen nodes. Thus fault tolerance is key to scalability.
Google was amongst the first organisations that built systems that work with
data on a big scale. Thus the state of the art is heavily inspired by Google’s
technology stack, namely Chubby [5], MapReduce [9], Google File System [10]
and BigTable [7]. These technologies enable data to be processed in a
warehouse-scale datacentre, involving tens of thousands of machines persisting petabytes of
data. The academic community is also pushing Big Data further with projects
covering warehouse-scale query processing [2, 16] and column stores [15, 1].
Google is now working with data on a still bigger scale. To cope with dis-
asters that damage or disrupt datacentres, Google’s flagship products, such as F1
(Google’s advertising backend), require data to be replicated globally [3, 8]. These
products are required to scale to millions of nodes across hundreds of datacentres.
Global replication brings data closer to consumers, thus improving read perfor-
mance. However, due to fluctuating high latency in a Wide Area Network, global
replication brings significant new challenges. In particular, minute attention to
time is required to maintain adequate consistency.
2 The State of the Art
In lectures, we have covered the key technologies that allow Google to process
data on a warehouse scale. Fault tolerant parallel computations are delivered by
MapReduce [9]. Fault tolerant distributed data persistence is provided by the
Google File System (GFS) [10]. Fault tolerant distributed indexing is provided
by BigTable [7]. Many of Google’s services, e.g. Google Personalized Search,
Google Analytics and Google Earth, fit a particular computational model. The
application uses BigTable to look up the data to extract from GFS, which is then
fed into a MapReduce job. The output of the MapReduce job is then persisted and indexed
using BigTable on top of GFS.
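The computational model just described can be sketched in a few lines. This is only an illustration under loud assumptions: the dictionaries `index` and `file_store` are hypothetical in-memory stand-ins for BigTable and GFS, and `map_phase`, `reduce_phase` and `run_job` are names invented here rather than part of any Google or Hadoop API. The job shown is a word count.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins; none of these names come from a real API.
file_store = {                                # plays the role of GFS: path -> contents
    "/logs/a": "spanner paxos spanner",
    "/logs/b": "paxos truetime",
}
index = {"logs": ["/logs/a", "/logs/b"]}      # plays the role of BigTable: row key -> paths


def map_phase(text):
    """Emit one (word, 1) pair per word in the input text."""
    return [(word, 1) for word in text.split()]


def reduce_phase(pairs):
    """Sum the counts emitted for each word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def run_job(row_key, output_path):
    """Look up inputs in the index, run map and reduce, then persist and index the output."""
    paths = index[row_key]                    # 1. look up input locations
    pairs = []
    for path in paths:                        # 2. read each input from the file store
        pairs.extend(map_phase(file_store[path]))   # 3. map
    result = reduce_phase(pairs)              # 4. reduce
    file_store[output_path] = result          # 5. persist the output
    index.setdefault("results", []).append(output_path)  # 6. index the output
    return result
```

Calling `run_job("logs", "/out/wordcount")` reads both log files, counts the words, and leaves the result both stored and indexed. A real deployment would distribute each step across many machines and tolerate their failure, which this single-process sketch does not attempt.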
The open source community has developed several projects based on Google’s
scalable systems. Apache Hadoop, the Hadoop File System (HDFS) and HBase
are open source clones of Google’s MapReduce, GFS and BigTable respectively.
Thus anyone can deploy a clone of Google’s systems. For example, Facebook
deploys these Apache Hadoop technologies to manage its multi-petabyte data
warehouses [4].
Hadoop, HDFS and HBase all hold data critical for operation and recovery at
a single master node. If the master node were to become unavailable, the whole
system would temporarily become unavailable. Furthermore, catastrophic failure
leading to data loss at the master node could lead to the entire system becoming
unrecoverable. Google achieves a highly available system by removing critical
single points of failure using the Chubby locking service [5]. Chubby removes
the single point of failure, while maintaining consistency, by using the Paxos
algorithm [13, 6]. Apache ZooKeeper fulfils the role of Chubby in the Hadoop
ecosystem.
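To make the role of Paxos concrete, here is a minimal sketch of basic single-decree Paxos, the algorithm of [13] in its textbook form. It is not Chubby's implementation. The class `Acceptor` and the function `propose` are names chosen for this illustration, all calls are local and synchronous, and message loss, retries and node failure are assumed away.

```python
class Acceptor:
    def __init__(self):
        self.promised = -1          # highest proposal number promised so far
        self.accepted_n = -1        # number of the last accepted proposal, if any
        self.accepted_value = None

    def prepare(self, n):
        """Phase 1: promise to ignore proposals numbered below n."""
        if n > self.promised:
            self.promised = n
            return True, self.accepted_n, self.accepted_value
        return False, None, None

    def accept(self, n, value):
        """Phase 2: accept unless a higher-numbered promise has been made."""
        if n >= self.promised:
            self.promised = n
            self.accepted_n = n
            self.accepted_value = value
            return True
        return False


def propose(acceptors, n, value):
    """Run one round with proposal number n; return the chosen value or None."""
    majority = len(acceptors) // 2 + 1
    replies = [a.prepare(n) for a in acceptors]
    promises = [r for r in replies if r[0]]
    if len(promises) < majority:
        return None                 # a higher-numbered proposal is in progress
    # If any acceptor already accepted a value, adopt the highest-numbered one.
    prior_n, prior_value = max(
        ((p[1], p[2]) for p in promises), key=lambda pair: pair[0]
    )
    if prior_n >= 0:
        value = prior_value
    acks = sum(1 for a in acceptors if a.accept(n, value))
    return value if acks >= majority else None
```

With three acceptors, `propose(acceptors, 1, "master=A")` chooses `"master=A"`. A later `propose(acceptors, 2, "master=B")` also returns `"master=A"`, because the second proposer must adopt the value already accepted. This is the property that lets a group of replicas agree on a single master without relying on any one node.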
3 Towards Globally Distributed Data
It is your job to take our investigation a step further by studying Spanner [8] —
Google’s latest database technology. Spanner is a database that is designed to
meet the data management requirements of Google’s growing suite of applica-
tions. Google’s applications must not only operate at scale, but also must guaran-
tee high availability and prevent data loss. Google achieves these requirements by
replicating data across several datacentres located in distinct geographical regions.
BigTable is designed with the assumption that all nodes are connected on
a high speed network in a single warehouse scale datacentre. Thus BigTable
assumes that the round trip between any two nodes is less than a millisecond
(see [7] Section 7). When replicating data between datacentres this assumption
no longer holds. Google’s quick intermediate solution was Megastore [3], which
builds directly on top of several BigTable instances, running in each datacentre.
Megastore is adequate to support several of Google’s flagship products, including
Gmail, Picasa, Calendar, Android Market and Google AppEngine. However, the
performance of Megastore, in particular write throughput, is poor.
Megastore uses Paxos to replicate primary user data across datacentres on every
write [3]. Instead of building on top of BigTable, Spanner [8] is a redesign
of Google’s data infrastructure with global distribution at the core. Spanner im-
proves write performance while maintaining strong consistency. The notion of
strong consistency maintained is called external consistency. External consistency
is defined by Gifford [11] as follows:
External consistency guarantees that a transaction will always receive
current information. The actual time order in which transactions complete
defines a unique serial schedule. This serial schedule is called
the external schedule. A system is said to provide external consistency
if it guarantees that the schedule it will use to process a set of
transactions is equivalent to its external schedule.
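Gifford's definition can be turned into a small check. The sketch below rests on two simplifying assumptions that are not in the definition: completion times are distinct, and "equivalent" is read as "the same order", which is stricter than schedule equivalence in general. The type `Txn` and the three functions are hypothetical names introduced only for this illustration.

```python
from typing import NamedTuple


class Txn(NamedTuple):
    name: str
    completed_at: float     # actual time at which the transaction completed
    serial_position: int    # position in the schedule the system actually used


def external_schedule(txns):
    """Order transactions by the actual time at which they completed."""
    return [t.name for t in sorted(txns, key=lambda t: t.completed_at)]


def system_schedule(txns):
    """Order transactions by the serial position the system assigned."""
    return [t.name for t in sorted(txns, key=lambda t: t.serial_position)]


def is_externally_consistent(txns):
    """True when the system's schedule matches the external schedule."""
    return external_schedule(txns) == system_schedule(txns)
```

For example, if T1 completes before T2 in actual time but the system serialises T2 first, the check fails. A later reader could then miss the effects of T1 even though T1 had already finished.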
Spanner implements a TrueTime API that uses multiple clock references to
keep the global error on actual time within 10 milliseconds. TrueTime is critical
to supporting external consistency, while maintaining good write performance.
Measuring time is known to be important for guaranteeing reasonable progress
for distributed algorithms [12]; however Spanner is one of the few systems where
the finest attention to time is critical. The critical nature of TrueTime in Spanner
gives us new insight into distributed systems as concluded by Corbett et al. [8]:
As a community, we should no longer depend on loosely synchro-
nised clocks and weak time APIs in designing distributed algorithms.
Thus Spanner is a milestone for computer science that demonstrates that time
should be part of the design of globally distributed algorithms.
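As a starting point for your reading of Sections 3 and 4 of [8], the following toy sketch shows the general idea of an interval-based clock and a commit that waits out the clock uncertainty. It is written from our reading of the paper and is not Google's implementation. The class `TrueTime`, the function `commit`, the injected `clock` and `sleep` parameters, and the fixed `epsilon` of 10 milliseconds are all assumptions made for this illustration; in Spanner the error bound varies over time.

```python
import time


class TrueTime:
    """Toy clock that reports an interval assumed to contain the actual time."""

    def __init__(self, epsilon=0.010, clock=time.monotonic):
        self.epsilon = epsilon      # assumed bound on clock error, in seconds
        self.clock = clock          # injectable time source (an assumption)

    def now(self):
        """Return (earliest, latest) bounds on the actual time."""
        t = self.clock()
        return (t - self.epsilon, t + self.epsilon)

    def after(self, t):
        """True once t has definitely passed."""
        return self.now()[0] > t

    def before(self, t):
        """True while t has definitely not yet arrived."""
        return self.now()[1] < t


def commit(tt, apply_write, sleep=time.sleep):
    """Pick a commit timestamp, wait until it has definitely passed, then apply."""
    s = tt.now()[1]                 # timestamp: latest possible current time
    while not tt.after(s):          # commit wait, roughly 2 * epsilon in this toy
        sleep(tt.epsilon / 10)
    apply_write(s)
    return s
```

Because the write only becomes visible after its timestamp has definitely passed, any transaction that starts afterwards will pick a later timestamp. Timestamp order therefore matches actual time order, at the cost of a wait proportional to the clock uncertainty. This is why a tight error bound matters for write performance.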
4 Your Challenge
Your work should be submitted as a technical report written in English. The page
limit is 10 pages. In your report, put yourself in the situation of a systems analyst
who is extracting the technical requirements of a scalable, globally distributed,
data-centric system inspired by Google Spanner. You should address the following four
points:
• Briefly outline the state of the art for working with Big Data and explain
  the role of Paxos in BigTable. Also speculate about how Paxos can be used
  to make the master node more reliable in MapReduce. Remember that the
  master node holds critical routing and status information about all the nodes
  and tasks involved in a MapReduce job. You will require the papers on
  BigTable [7], Chubby [5] and MapReduce [9] to answer this question.
• Read the paper on Spanner [8]. Use the paper to explain how Spanner uses a
  modified Paxos algorithm to replicate primary user data across globally
  distributed datacentres. Describe how this modified Paxos algorithm is
  different from the basic Paxos algorithm [13]. Please use your recently obtained
  background knowledge to expand beyond the explanation of Paxos given
  in [8] Section 2. In particular, provide details about how the communication
  phases of the modified Paxos algorithm used in Google Spanner work.
• Explain what the TrueTime API is and how TrueTime is used to improve the
  write performance of Spanner while maintaining external consistency. In
  particular, identify how TrueTime is used in the Paxos algorithm described
  in the previous bullet point. You should interpret the paper on Spanner so
  that developers wishing to implement an (open source) clone of Spanner
  can identify the requirements. TrueTime and its usage are covered in
  Sections 3 and 4 of [8].
• Conclude by comparing the use of Paxos in Spanner to the use of Paxos in
  BigTable. Argue the case for developing an open source clone of Spanner
  by highlighting the benefits of global distribution. Finally, summarise the
  technical challenges identified in the body of the report for developing such
  a Spanner clone.
The deadline for this report is Tuesday 14th May. Please start reading the
paper on Spanner as soon as possible — it will take several readings to understand.
Also, please contact me before the deadline if you have difficulty understanding
any of the points above.


The Influence of Female Leaders toward Organizational Performance in ASEAN (AEC)
Read the case study titled “F&C International” prior to doing this assignment.
1. Over time, there has been significant legislation passed, such as Sarbanes-Oxley, yet corporate
fraud is still pervasive in today’s business environment. Suggest three (3) new ways that you
believe will eradicate corporate fraud.
2. In the F&C case, inventory manipulation was used to enact the fraud. Discuss the proper internal
controls needed over inventory and how these controls will act as a deterrent to fraudulent
activities.
3. For a moment, step into the shoes of Catherine Sprauer at F&C International. Indicate what you
would have done following each of the confrontations she had with the two employees who
insisted that F&C executives were involved in a fraudulent scheme to misrepresent the
company’s financial statements.
4. Discuss how accounting firms should modify their audit procedures to ensure the risk of financial
fraud is minimized.
5. Discuss how the Securities and Exchange Commission (SEC) continues to fail to detect
fraudulent activities in publicly traded companies. Suggest a recommendation for improvement.
6. Evaluate whether legislation and regulatory agency oversight will increase or decrease corporate
fraud. Explain your position.

Regional Trade Agreements versus Global Trade Liberalization


In the globalizing economy of the late 20th and early 21st century, liberalized trade has been sought by way of regional trade agreements and broader global trade liberalization. The policy choice between these two approaches has created debates among economists and politicians concerning which is a better strategy for various countries and for the global economy as a whole, and whether the two approaches are complementary or contradictory.

Begin by reviewing pages VII and VIII in the Module 5 lecture on developing an annotated bibliography through the structure of a research system.

Using the CSU-GC Virtual Library, the Web, and/or other sources of scholarly literature, begin your research into regional trade agreements versus global trade liberalization by locating professional and academic journals and select current research articles published



Detecting Unethical Practices at Supplier Faculty
Write a one (1) page paper that addresses the three (3) questions written below. Each question (header) must be placed as a Section Heading Topic within the paper (for example: Question #1 would be the topic, and below the topic in a separate paragraph would follow the response to the question). Also, all parts of the question must be answered. Lastly, the required number of pages is annotated with each question listed below.

 

Questions:

1. Assess the value of having a Supplier Code of Conduct when outsourcing operational functions to international markets and the enforceability of such a code.

2. Evaluate whether or not you believe a U.S.-based company outsourcing jobs to foreign markets is ethical. Support your position.

3. Assume that you have to make the decision to outsource work to a foreign market. Determine what country would be your best option. Explain your rationale


Case: Outputs Diagnosis, SLP: Time Warp 3
Case for Whole Foods using Nadler-Tushman Congruence Model:

Use the Nadler-Tushman Congruence Model to analyze Whole Foods Market’s outputs. Start with the Organizational Level. Identify the Outputs: what does it produce and sell? What are its goals? How has it been performing? Then go to the group level. What are some groups that Whole Foods Market identifies, and what are the goals and performance of these groups? Finally, discuss the Individual level. Here you will find it difficult to get much detailed information, so identify five to seven key jobs and their outputs. How can the performance of these jobs be measured? Finally, determine the congruence of the outputs and make a strong argument for your case.

Identify the outputs of the organization at each of three levels. Also identify the goals that it has set and its current performance. Include the following:
• Outputs at the organizational level are the products and/or services that it provides to its customers. What are these and how does the company categorize them? How does it measure its organizational performance (e.g., sales, net profit, return on sales, return on assets, market share, customer satisfaction, etc.)? Provide some specific performance data.
• What are some ways the company identifies groups? For example, are there geographic groups (or divisions), functional groups, etc.? What are the outputs of these groups? How does it (or how might it) measure performance of these groups?
• What are some of the key individual functions, and what are their outputs? How do these outputs contribute to the group outputs? How do they measure individual performance?
• Evaluate how the outputs at the different levels interact with each other. Determine if you think the congruence of the outputs is high, medium, or low. Then make a strong case. It is very important that you support your position with evidence and information that you have discussed earlier in the report.


CONSUMER CULTURE: GLOBALISATION, MATERIALISM & RESISTANCE
Discuss the extent to which the globalisation of consumer culture engenders a cosmopolitan culture, where individuals show ‘openness toward divergent cultural experiences’ (Hannerz 1990: 239). You should discuss your essay using theories of cultural globalisation introduced to you in the lecture and illustrate your arguments with supporting examples.

Key Theories and Suggestions:

• John Tomlinson (2003): Globalisation and Culture
• George Ritzer (2006): The McDonaldization of Society
• Adorno and Horkheimer (1944/2000): The Culture Industry
• Arjun Appadurai (1988): Indigenization
• Hannerz (1990): Cosmopolitanism
• Daniel Miller (1995): Worlds apart: Modernity through the prism of the local
address the question asked and not try to re-define, ‘twist it round’, or state it in more general terms -to allow you to write about something else. In order to remind yourself of this, always put the question addressed at the beginning of your work. ‘Not answering the question’ will result in a significant loss of marks.

• You must demonstrate ability to synthesize key theories and concepts and develop key themes and/or arguments.

• Your essay must be supported by illustrative examples and/or case studies. You are allowed to use supporting media and/or materials (such as images, sound and other medium).

• Being asked to discuss something is not the same as being asked to list statements. A discussion will consider alternative points of view and your own thinking and evaluation should be apparent in the discussion of the topic.

• Your essay must be properly referenced:
o only sources referred to specifically in the text of your answer should be included in the bibliography;
o all sources (including those for any numeric examples used) should be acknowledged;
o there should be no references in your answer to sources which are not in your bibliography BUT if you have not consulted the reference directly yourself you should indicate in the text of your answer the secondary source from which it comes. It is this secondary source that should be in your bibliography.

• Listing a reference in the bibliography does not make it acceptable to copy sections of the book into your answer unless it is explicitly stated as a quotation. You must summarise the points in your own words. Plagiarism is regarded as a most serious instance of academic misconduct and is dealt with accordingly.
• It is expected that you will consult academic and professional journals as well as textbooks. Many textbooks cover much the same information, so consulting many different textbooks only results in duplicating this information. Textbooks tend not to have very up-to-date content, and journal sources are vital for this.
• Before you write your essay, work out on paper a detailed outline of your argument.
• In the essay introduction, you should set out your main themes and intentions: describe the issue you are addressing, the illustrative case or contexts you are going to discuss (if any), identify its main components, and indicate what you are going to do in the body of your essay.
• Break down your arguments into main parts – use this as a basis of your essay that will then be divided up into several sections (you may want to have a section title for each section).
• Build up your argument point-by-point, section-by-section, so that you develop a picture that slowly develops in the reader’s mind.
• Always try to put yourself in the position of a critical reader, ask yourself how s/he would react to your essay, how s/he would understand it and be persuaded by it.
• Do not simply describe the ideas and literature you’re dealing with, provide a critical evaluation.
• Summarise your arguments in conclusion. What is the main significance of what you have been saying?

the mitigation analysis of unusable motor vehicles in Saudi Arabia
1. Write about the Saudi Arabian government acts, regulations, laws and systems regarding unusable vehicles that are left on the roads for a long period of time.
2. Write about the legal or illegal situation regarding abandoned and derelict vehicles on public roads in Saudi Arabia.
3. Write about the opinion of the Saudi Arabian government on leaving unusable vehicles on the roads and in the vehicle dump for a long period of time. (300 words)
4. Write the crucial questions to be resolved, and also make a questionnaire on this issue.
