CCA-500 | Latest CCA-500 Dumps 2020

We provide validated Cloudera CCA-500 exam answers for passing the CCA-500 test and earning the Cloudera Certified Administrator for Apache Hadoop (CCAH) certification. The CCA-500 Questions & Answers cover all the knowledge points of the real CCA-500 exam.

Free online CCA-500 questions and answers (new version):

You use the hadoop fs -put command to add a file “sales.txt” to HDFS. This file is small enough that it fits into a single block, which is replicated to three nodes in your cluster (with a replication factor of 3). One of the nodes holding this file (a single block) fails. How will the cluster handle the replication of the file in this situation?

  • A. The file will remain under-replicated until the administrator brings that node back online
  • B. The cluster will re-replicate the file the next time the system administrator reboots the NameNode daemon (as long as the file’s replication factor doesn’t fall below)
  • C. This will be immediately re-replicated and all other HDFS operations on the cluster will halt until the cluster’s replication values are restored
  • D. The file will be re-replicated automatically after the NameNode determines it is under-replicated based on the block reports it receives from the DataNodes

Answer: D
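
The detection step described in answer D can be sketched as a toy model. This is an illustrative simulation only, not NameNode code: the block ID, the node names, and the function `find_under_replicated` are all hypothetical.

```python
def find_under_replicated(block_locations, live_nodes, replication_factor=3):
    """Return block IDs whose live replica count is below the target.

    block_locations: dict mapping block ID -> set of DataNode names
                     (hypothetical stand-in for what block reports convey).
    live_nodes:      set of DataNodes that are still alive.
    """
    under = []
    for block_id, holders in sorted(block_locations.items()):
        live_replicas = holders & live_nodes  # replicas on failed nodes do not count
        if len(live_replicas) < replication_factor:
            under.append(block_id)
    return under


# sales.txt fits in one block stored on three nodes; node2 then fails.
locations = {"blk_sales": {"node1", "node2", "node3"}}
live = {"node1", "node3", "node4"}
print(find_under_replicated(locations, live))
```

In this sketch the block is flagged because only two of its three replicas sit on live nodes; a real cluster would then schedule a new copy on another DataNode without halting other HDFS operations.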

You want a node to only swap Hadoop daemon data from RAM to disk when absolutely necessary. What should you do?

  • A. Delete the /dev/vmswap file on the node
  • B. Delete the /etc/swap file on the node
  • C. Set the ram.swap parameter to 0 in core-site.xml
  • D. Set the vm.swappiness parameter on the node
  • E. Delete the /swapfile file on the node

Answer: D
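
On Linux, how eagerly the kernel swaps is governed by the vm.swappiness sysctl. As a minimal sketch, assuming the setting is kept in sysctl.conf-style text, the helper below reads the value back so it can be checked on each node. The function name `read_swappiness` and the sample fragment (including the value 1) are hypothetical.

```python
def read_swappiness(sysctl_text):
    """Return the last vm.swappiness value set in sysctl.conf-style text, or None."""
    value = None
    for line in sysctl_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        key, _, raw = line.partition("=")
        if key.strip() == "vm.swappiness":
            value = int(raw.strip())  # later entries override earlier ones
    return value


sample = """
# hypothetical /etc/sysctl.conf fragment
vm.swappiness = 1
"""
print(read_swappiness(sample))
```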

Cluster Summary:
45 files and directories, 12 blocks = 57 total. Heap size is 15.31 MB/193.38MB(7%)
[Exhibit: cluster summary screenshot]
Refer to the above screenshot.
You configure a Hadoop cluster with seven DataNodes and one of your monitoring UIs displays the details shown in the exhibit.
What does this tell you?

  • A. The DataNode JVM on one host is not active
  • B. Because your under-replicated blocks count matches the Live Nodes, one node is dead, and your DFS Used % equals 0%, you can’t be certain that your cluster has all the data you’ve written to it.
  • C. Your cluster has lost all HDFS data which had blocks stored on the dead DataNode
  • D. The HDFS cluster is in safe mode

Answer: A

You have a cluster running with the Fair Scheduler enabled. There are currently no jobs running on the cluster, and you submit job A, so that only job A is running on the cluster. A while later, you submit job B. Now job A and job B are running on the cluster at the same time. How will the Fair Scheduler handle these two jobs? (Choose two)

  • A. When Job B gets submitted, it will get assigned tasks, while job A continues to run with fewer tasks.
  • B. When Job B gets submitted, Job A has to finish first, before job B can get scheduled.
  • C. When Job A gets submitted, it doesn’t consume all the task slots.
  • D. When Job A gets submitted, it consumes all the task slots.

Answer: AD
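
How a fair scheduler divides capacity as jobs arrive can be sketched with a simplified model. It assumes equal job weights in a single pool and ignores preemption and the delay while running tasks finish; the slot count and the function `fair_share` are hypothetical.

```python
def fair_share(total_slots, running_jobs):
    """Split slots evenly across running jobs; leftovers go to the earliest jobs."""
    if not running_jobs:
        return {}
    base, extra = divmod(total_slots, len(running_jobs))
    return {
        job: base + (1 if i < extra else 0)
        for i, job in enumerate(running_jobs)
    }


print(fair_share(100, ["A"]))       # job A alone may use every slot
print(fair_share(100, ["A", "B"]))  # once B arrives, A's share shrinks
```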

Table schemas in Hive are:

  • A. Stored as metadata on the NameNode
  • B. Stored along with the data in HDFS
  • C. Stored in the Metastore
  • D. Stored in ZooKeeper

Answer: C

Which YARN process runs as “container 0” of a submitted job and is responsible for resource requests?

  • A. ApplicationManager
  • B. JobTracker
  • C. ApplicationMaster
  • D. JobHistoryServer
  • E. ResourceManager
  • F. NodeManager

Answer: C

For each YARN job, the Hadoop framework generates task log files. Where are Hadoop task log files stored?

  • A. Cached by the NodeManager managing the job containers, then written to a log directory on the NameNode
  • B. Cached in the YARN container running the task, then copied into HDFS on job completion
  • C. In HDFS, in the directory of the user who generates the job
  • D. On the local disk of the slave node running the task

Answer: D

Assuming you’re not running HDFS Federation, what is the maximum number of NameNode daemons you should run on your cluster in order to avoid a “split-brain” scenario with your NameNode when running HDFS High Availability (HA) using Quorum-based storage?

  • A. Two active NameNodes and two Standby NameNodes
  • B. One active NameNode and one Standby NameNode
  • C. Two active NameNodes and one Standby NameNode
  • D. Unlimited. HDFS High Availability (HA) is designed to overcome limitations on the number of NameNodes you can deploy

Answer: B

You have a cluster running with a FIFO scheduler enabled. You submit a large job A to the cluster, which you expect to run for one hour. Then, you submit job B to the cluster, which you expect to run a couple of minutes only.
You submit both jobs with the same priority.
Which two best describe how the FIFO Scheduler arbitrates the cluster resources for jobs and their tasks? (Choose two)

  • A. Because there is more than a single job on the cluster, the FIFO Scheduler will enforce a limit on the percentage of resources allocated to a particular job at any given time
  • B. Tasks are scheduled in the order of their job submission
  • C. The order of execution of jobs may vary
  • D. Given jobs A and B submitted in that order, all tasks from job A are guaranteed to finish before all tasks from job B
  • E. The FIFO Scheduler will give, on average, an equal share of the cluster resources over the job lifecycle
  • F. The FIFO Scheduler will pass an exception back to the client when Job B is submitted, since all slots on the cluster are in use

Answer: AD

During the execution of a MapReduce v2 (MRv2) job on YARN, where does the Mapper place the intermediate data of each Map Task?

  • A. The Mapper stores the intermediate data on the node running the Job’s ApplicationMaster so that it is available to YARN ShuffleService before the data is presented to the Reducer
  • B. The Mapper stores the intermediate data in HDFS on the node where the Map tasks ran in the HDFS /usercache/&(user)/apache/application_&(appid) directory for the user who ran the job
  • C. The Mapper transfers the intermediate data immediately to the reducers as it is generated by the Map Task
  • D. YARN holds the intermediate data in the NodeManager’s memory (a container) until it is transferred to the Reducer
  • E. The Mapper stores the intermediate data on the underlying filesystem of the local disk in the directories set by yarn.nodemanager.local-dirs

Answer: E

You need to analyze 60,000,000 images stored in JPEG format, each of which is approximately 25 KB. Because your Hadoop cluster isn’t optimized for storing and processing many small files, you decide to do the following actions:
1. Group the individual images into a set of larger files
2. Use the set of larger files as input for a MapReduce job that processes them directly with Python using Hadoop streaming.
Which data serialization system gives the flexibility to do this?

  • A. CSV
  • B. XML
  • C. HTML
  • D. Avro
  • E. SequenceFiles
  • F. JSON

Answer: E

Sequence files can be block-compressed and provide direct serialization and deserialization of several arbitrary data types (not just text). Sequence files can be generated as the output of other MapReduce tasks and are an efficient intermediate representation for data that is passing from one MapReduce job to another.
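
The idea of grouping many small binary files into one larger file of key/value records can be sketched as follows. This is not the SequenceFile on-disk format: the length-prefixed layout and the functions `pack_records` and `unpack_records` are hypothetical, chosen only to show file names as keys and raw image bytes as values.

```python
import io
import struct


def pack_records(records):
    """Pack (name, payload) pairs into one bytes blob, length-prefixing each field."""
    buf = io.BytesIO()
    for name, payload in records:
        key = name.encode("utf-8")
        # Two big-endian unsigned 32-bit lengths: key size, then value size.
        buf.write(struct.pack(">II", len(key), len(payload)))
        buf.write(key)
        buf.write(payload)
    return buf.getvalue()


def unpack_records(blob):
    """Yield (name, payload) pairs back out of a blob made by pack_records."""
    offset = 0
    while offset < len(blob):
        key_len, val_len = struct.unpack_from(">II", blob, offset)
        offset += 8
        name = blob[offset:offset + key_len].decode("utf-8")
        offset += key_len
        payload = blob[offset:offset + val_len]
        offset += val_len
        yield name, payload


images = [("a.jpg", b"\xff\xd8\xff\xe0"), ("b.jpg", b"\xff\xd8\xff\xe1")]
blob = pack_records(images)
print(list(unpack_records(blob)) == images)
```

Because values are stored as raw bytes, binary data such as JPEG images survives the round trip unchanged, which text formats like CSV or JSON do not offer without extra encoding.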

Assume you have a file named foo.txt in your local directory. You issue the following three commands:
hadoop fs -mkdir input
hadoop fs -put foo.txt input/foo.txt
hadoop fs -put foo.txt input
What happens when you issue the third command?

  • A. The write succeeds, overwriting foo.txt in HDFS with no warning
  • B. The file is uploaded and stored as a plain file named input
  • C. You get a warning that foo.txt is being overwritten
  • D. You get an error message telling you that foo.txt already exists, and asking you if you would like to overwrite it.
  • E. You get an error message telling you that foo.txt already exists. The file is not written to HDFS
  • F. You get an error message telling you that input is not a directory
  • G. The write silently fails

Answer: E
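
The path resolution behind the third command can be sketched with a toy in-memory model. It is not the HDFS client: the `put` function, the dictionary-based file store, and the `overwrite` parameter (standing in for the shell's -f flag) are hypothetical simplifications.

```python
def put(fs, dirs, local_name, dest, overwrite=False):
    """Toy model of copying a local file to a destination path.

    fs:   dict mapping file path -> contents (hypothetical in-memory store)
    dirs: set of directory paths
    """
    # A directory destination means "place the file inside it under its own name".
    target = f"{dest}/{local_name}" if dest in dirs else dest
    if target in fs and not overwrite:
        raise FileExistsError(target)
    fs[target] = f"contents of {local_name}"
    return target


fs, dirs = {}, set()
dirs.add("input")                          # mkdir input
put(fs, dirs, "foo.txt", "input/foo.txt")  # first put creates input/foo.txt
try:
    put(fs, dirs, "foo.txt", "input")      # resolves to input/foo.txt again
except FileExistsError as err:
    print("File exists:", err)
```

Because input is a directory, the third command targets input/foo.txt, which the second command already created, so the write is refused rather than silently replacing the file.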

Which two features does Kerberos security add to a Hadoop cluster?(Choose two)

  • A. User authentication on all remote procedure calls (RPCs)
  • B. Encryption for data during transfer between the Mappers and Reducers
  • C. Encryption for data on disk (“at rest”)
  • D. Authentication for user access to the cluster against a central server
  • E. Root access to the cluster for users hdfs and mapred but non-root access for clients

Answer: AD

You decide to create a cluster which runs HDFS in High Availability mode with automatic failover, using Quorum Storage. What is the purpose of ZooKeeper in such a configuration?

  • A. It only keeps track of which NameNode is Active at any given time
  • B. It monitors an NFS mount point and reports if the mount point disappears
  • C. It both keeps track of which NameNode is Active at any given time, and manages the Edits file, which is a log of changes to the HDFS filesystem
  • D. It only manages the Edits file, which is a log of changes to the HDFS filesystem
  • E. Clients connect to ZooKeeper to determine which NameNode is Active

Answer: A

You have a Hadoop cluster running HDFS, and a gateway machine external to the cluster from which clients submit jobs. What do you need to do in order to run Impala on the cluster and submit jobs from the command line of the gateway machine?

  • A. Install the impalad daemon, statestored daemon, and catalogd daemon on each machine in the cluster, and the impala shell on your gateway machine
  • B. Install the impalad daemon, the statestored daemon, the catalogd daemon, and the impala shell on your gateway machine
  • C. Install the impalad daemon and the impala shell on your gateway machine, and the statestored daemon and catalogd daemon on one of the nodes in the cluster
  • D. Install the impalad daemon on each machine in the cluster, the statestored daemon and catalogd daemon on one machine in the cluster, and the impala shell on your gateway machine
  • E. Install the impalad daemon, statestored daemon, and catalogd daemon on each machine in the cluster and on the gateway node

Answer: D

Each node in your Hadoop cluster, running YARN, has 64GB memory and 24 cores. Your yarn-site.xml has the following configuration:
You want YARN to launch no more than 16 containers per node. What should you do?

  • A. Modify yarn-site.xml with the following property: <name>yarn.scheduler.minimum-allocation-mb</name><value>2048</value>
  • B. Modify yarn-site.xml with the following property: <name>yarn.scheduler.minimum-allocation-mb</name><value>4096</value>
  • C. Modify yarn-site.xml with the following property: <name>yarn.nodemanager.resource.cpu-vcores</name>
  • D. No action is needed: YARN’s dynamic resource allocation automatically optimizes the node memory and cores

Answer: A
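
The arithmetic behind a per-node container cap can be sketched as below. The question's configuration is not reproduced in the text, so the NodeManager memory figure of 32768 MB is purely an assumption for illustration, as is the function name `max_containers_by_memory`. The sketch considers memory only and assumes every container is granted exactly the minimum allocation.

```python
def max_containers_by_memory(nodemanager_memory_mb, min_allocation_mb):
    """Upper bound on concurrent containers when each takes the minimum allocation."""
    return nodemanager_memory_mb // min_allocation_mb


# Hypothetical: 32768 MB of the node's 64 GB is handed to the NodeManager.
for min_alloc in (2048, 4096):
    print(min_alloc, max_containers_by_memory(32768, min_alloc))
```

Under that assumption, a 2048 MB minimum allocation caps the node at 16 containers, while 4096 MB would cap it at 8.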

Which scheduler would you deploy to ensure that your cluster allows short jobs to finish within a reasonable time without starving long-running jobs?

  • A. Complexity Fair Scheduler (CFS)
  • B. Capacity Scheduler
  • C. Fair Scheduler
  • D. FIFO Scheduler

Answer: C


You are running a Hadoop cluster with a NameNode on host mynamenode, a secondary NameNode on host mysecondarynamenode and several DataNodes.
Which best describes how you determine when the last checkpoint happened?

  • A. Execute hdfs namenode -report on the command line and look at the Last Checkpoint information
  • B. Execute hdfs dfsadmin -saveNamespace on the command line which returns to you the last checkpoint value in fstime file
  • C. Connect to the web UI of the Secondary NameNode (http://mysecondary:50090/) and look at the “Last Checkpoint” information
  • D. Connect to the web UI of the NameNode (http://mynamenode:50070) and look at the “Last Checkpoint” information

Answer: C

Your cluster has the following characteristics:
✑ A rack aware topology is configured and on
✑ Replication is set to 3
✑ Cluster block size is set to 64MB
Which describes the file read process when a client application connects into the cluster and requests a 50MB file?

  • A. The client queries the NameNode for the locations of the block, and reads all three copies. The first copy to complete transfer to the client is the one the client reads as part of Hadoop’s speculative execution framework.
  • B. The client queries the NameNode for the locations of the block, and reads from the first location in the list it receives.
  • C. The client queries the NameNode for the locations of the block, and reads from a random location in the list it receives to eliminate network I/O loads by balancing which nodes it retrieves data from at any given time.
  • D. The client queries the NameNode which retrieves the block from the nearest DataNode to the client then passes that block back to the client.

Answer: B
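
With rack awareness on, the NameNode returns replica locations ordered by proximity to the client, so reading from the first entry means reading from the closest copy. A minimal sketch of that ordering follows; the distance values 0, 2, and 4 assume a simple two-level rack/host topology, and the host names, rack names, and both function names are hypothetical.

```python
def distance(client, replica):
    """Distance between (rack, host) pairs: 0 same host, 2 same rack, 4 otherwise."""
    if client == replica:
        return 0
    if client[0] == replica[0]:
        return 2
    return 4


def order_replicas(client, replicas):
    """Sort replica locations so the closest one to the client comes first."""
    return sorted(replicas, key=lambda r: distance(client, r))


client = ("rack1", "host3")
replicas = [("rack2", "host7"), ("rack1", "host5"), ("rack2", "host9")]
print(order_replicas(client, replicas)[0])
```

A 50MB file with a 64MB block size occupies a single block, so the client performs this lookup once and streams the whole file from one DataNode.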

On a cluster running MapReduce v2 (MRv2) on YARN, a MapReduce job is given a directory of 10 plain text files as its input directory. Each file is made up of 3 HDFS blocks. How many Mappers will run?

  • A. We cannot say; the number of Mappers is determined by the ResourceManager
  • B. We cannot say; the number of Mappers is determined by the developer
  • C. 30
  • D. 3
  • E. 10
  • F. We cannot say; the number of mappers is determined by the ApplicationMaster

Answer: C
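
The mapper count follows from the number of input splits. As a simplified sketch, assuming splittable plain text files and a split size equal to the HDFS block size (the framework's real split calculation has additional tuning parameters), each file contributes one map task per block. The file sizes and the function `count_map_tasks` are hypothetical.

```python
def count_map_tasks(file_sizes_mb, split_size_mb):
    """One map task per input split; each file is split independently."""
    return sum(-(-size // split_size_mb) for size in file_sizes_mb)  # ceiling division


# Hypothetical sizes: ten files of 384 MB with 128 MB blocks, i.e. 3 blocks each.
print(count_map_tasks([384] * 10, 128))
```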

Which YARN daemon or service negotiates map and reduce Containers from the Scheduler, tracking their status and monitoring progress?

  • A. NodeManager
  • B. ApplicationMaster
  • C. ApplicationManager
  • D. ResourceManager

Answer: B

Identify two features/issues that YARN is designed to address: (Choose two)

  • A. Standardize on a single MapReduce API
  • B. Single point of failure in the NameNode
  • C. Reduce complexity of the MapReduce APIs
  • D. Resource pressure on the JobTracker
  • E. Ability to run frameworks other than MapReduce, such as MPI
  • F. HDFS latency

Answer: DE
