Questions tagged [mapreduce]
MapReduce is a programming model for processing huge datasets on certain kinds of distributable problems using a large number of nodes.
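As a rough illustration of the description above, here is a minimal single-process sketch of the map, shuffle, and reduce phases applied to word counting. The function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen for this example only; a real framework would distribute these phases across many nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Group all emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Each document could be mapped on a different node; here it is sequential.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Because each call to `map_phase` and `reduce_phase` depends only on its own input, the calls can in principle run in parallel, which is what makes the model suit distributable problems.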
28 questions
4
votes
3
answers
312
views
Given two sparse vectors, compute their dot product
Problem Statement:
Given two sparse vectors, compute their dot product.
Implement class SparseVector:
SparseVector(nums) Initializes the object with the vector nums
dotProduct(vec) Compute the dot ...
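One possible way to approach the problem statement above, as a sketch only: store just the non-zero entries in a dictionary keyed by index, then iterate over the smaller of the two dictionaries. The class and method names come from the excerpt; the `nonzero` attribute is an assumption made for this example, and the rest of the problem statement is truncated in the listing.

```python
class SparseVector:
    def __init__(self, nums):
        # Keep only the non-zero entries, keyed by their index.
        # `nonzero` is a name assumed for this sketch.
        self.nonzero = {i: v for i, v in enumerate(nums) if v != 0}

    def dotProduct(self, vec):
        # Iterate over the vector with fewer non-zero entries and
        # look each index up in the other one.
        small, large = self.nonzero, vec.nonzero
        if len(small) > len(large):
            small, large = large, small
        return sum(v * large[i] for i, v in small.items() if i in large)
```

For example, `SparseVector([1, 0, 0, 2, 3]).dotProduct(SparseVector([0, 3, 0, 4, 0]))` multiplies only the entries at index 3, where both vectors are non-zero.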
4
votes
0
answers
472
views
Subclass of Python's multiprocessing.Pool which allows progress reporting
For context, the whole of the project code can be found here. This question was created specifically for the progress.py file.
The goal behind it is to allow ...
4
votes
2
answers
69
views
Put every object in a specified bucket
I have this array of objects:
...
0
votes
2
answers
5k
views
Javascript + Filter object of values
I have an object with values. I am trying to filter it based on those values.
...
2
votes
1
answer
225
views
The best way for inserting multiple objects into array
I have a transformer helper function. It reduces over the array and transforms key/value pairs. At the end of the loop, if the key 'EXAMPLE1' exists, I should insert two objects after the first ...
2
votes
1
answer
83
views
Efficiently aggregating nested data
Problem
Given the following data:
...
1
vote
2
answers
201
views
Extract unique words from given text and group by letter count
The task is an exercise for learning Go. The idea is to extract unique words, sorted and grouped by length. It might be useful for learning new words. The program takes a command-line argument, assuming it's a file ...
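The grouping described above can be sketched roughly as follows. The question itself is about Go; this sketch uses Python and is not the asker's code. It assumes that a "word" is a run of ASCII letters and that case is ignored, neither of which the truncated excerpt confirms. The function name `group_unique_words` is hypothetical.

```python
import re
from collections import defaultdict


def group_unique_words(text):
    # Assumption: words are runs of ASCII letters, compared case-insensitively.
    words = set(re.findall(r"[a-z]+", text.lower()))
    groups = defaultdict(list)
    # Sorting first keeps each length group in alphabetical order.
    for word in sorted(words):
        groups[len(word)].append(word)
    # Return the groups ordered by word length.
    return dict(sorted(groups.items()))
```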
2
votes
1
answer
87
views
Looping over a bidimensional array and extracting data to a new one
I have a bidimensional array like this:
...
2
votes
1
answer
87
views
Building objects in javascript, without "if(!a[k]) a[k] = []"
When building objects using reduce, I often have crappy code like this:
...
2
votes
0
answers
338
views
Groovy Map/Reduce for Jenkins DSL
Jenkins DSL doesn't support collect and inject from what I can tell (I get missing method exceptions when I try), so I ...
3
votes
2
answers
816
views
Summarizing the score of a personality quiz
This function takes a list of questions and list of answers provided by the user.
The list of answers is always a list of booleans (for true and false) and the list of questions takes the following ...
6
votes
0
answers
91
views
Pyspark Solver for Tiered Board Games
I've written a Pyspark program that will completely solve a tiered board game (no loops, each game position is a member of only one tier) and writes each tier to a file. It also determines the ...
8
votes
3
answers
1k
views
Find Top 10 IP out of more than 5GB data
I have a few files whose total size is more than 5 GB. Each line of the files starts with an IP address and looks like:
127.0.0.1 reset success
...
127.0.0.2 reset success
How can I find the top 10 ...
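A minimal sketch of one way to approach this kind of counting, assuming the IP address is the first whitespace-separated field on each line (as in the sample lines above) and that the number of distinct addresses fits in memory. The names `iter_lines` and `top_ips` are hypothetical. The files are streamed line by line, so the full 5 GB is never loaded at once.

```python
from collections import Counter


def iter_lines(paths):
    """Yield lines from each file in turn without loading a whole file."""
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            yield from handle


def top_ips(lines, n=10):
    """Count the first field of every non-empty line and return the n most common."""
    counts = Counter()
    for line in lines:
        fields = line.split(maxsplit=1)
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts.most_common(n)
```

A call such as `top_ips(iter_lines(paths))` would then process every file in `paths`. If the distinct addresses do not fit in memory, the input would need to be partitioned first (for example by hashing the address), which is where a MapReduce-style split becomes relevant.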
3
votes
1
answer
123
views
Classifying and counting database entries using Scala map and flatMap
I am new to Spark and Scala and I have solved the following problem. I have a table in a database with the following structure:
...
5
votes
2
answers
665
views
Accepting user defined functions for custom map reduce functionality in C++
I am implementing map- and reduce-style functions for processing geospatial raster datasets.
I would like the ...