Iterators and Generators
- Please download the homework here
Overview
During this project, you will develop a class that implements an iterable for a list of businesses. You will be given a large set of business data and you are tasked with implementing a class that provides methods to initialize and transform an iterable over businesses. You will also be tasked with implementing two functions that create generator objects that extend the functionality of an iterable.
Learning Objectives
- Practice writing class-based TypeScript
- Work with iterators, iterables, and generators
- Learn about JavaScript Object Notation (JSON)
Student Expectations
Students will be graded on their ability to:
- Correctly implement the functions specified below
- Follow the coding, bad practice and testing guidelines
- Design full-coverage unit-tests for the functions they implemented
- See the testing guidelines on coverage for more details
Testing
You must write tests for all your functions, following the principles used so far.
Getting Started
JSON
JavaScript Object Notation (JSON) is a syntax for representing JavaScript objects, arrays, numbers, strings, booleans, and null as a complete (single) string. With this format, we can store runtime data on the disk or send it over a network.
Below is a JavaScript object,
const sandwichOrders = {
orders: [
{
orderId: 1234,
sandwiches: [
{
bread: "whole wheat",
vegetables: ["lettuce", "onion", "tomato", "hot peppers"],
meat: ["prosciutto", "salami"],
cheese: ["provolone"],
condiments: ["mayonnaise", "vinegar"],
isToasted: false,
},
{
bread: "white",
vegetables: null,
meat: ["bologna"],
cheese: null,
condiments: null,
isToasted: false,
},
],
total: 13.4,
},
],
};
and its JSON representation:
const str = JSON.stringify(sandwichOrders);
console.log(str);
// {"orders":[{"orderId":1234,"sandwiches":[{"bread":"whole wheat","vegetables":["lettuce","onion","tomato","hot peppers"],"meat":["prosciutto","salami"],"cheese":["provolone"],"condiments":["mayonnaise","vinegar"],"isToasted":false},{"bread":"white","vegetables":null,"meat":["bologna"],"cheese":null,"condiments":null,"isToasted":false}],"total":13.4}]}
We used the JSON.stringify function to convert our object into a string. All fields and values were placed into this string in a format that is readable:
{
"orders": [
{
"orderId": 1234,
"sandwiches": [
{
"bread": "whole wheat",
"vegetables": ["lettuce", "onion", "tomato", "hot peppers"],
"meat": ["prosciutto", "salami"],
"cheese": ["provolone"],
"condiments": ["mayonnaise", "vinegar"],
"isToasted": false
},
{
"bread": "white",
"vegetables": null,
"meat": ["bologna"],
"cheese": null,
"condiments": null,
"isToasted": false
}
],
"total": 13.4
}
]
}
Notice how all fields have been surrounded by double quotes. The quotes can be dropped in JS/TS - but JSON format indicates that fields are surrounded with double quotes. We can then transform this string back into an object using the JSON.parse function:
// JSON.parse(s: string): any
const other = JSON.parse(str);
console.log(other); // { orders: [Object] }
// `other` is an entirely different object than `obj`
The Yelp Dataset
The business review site Yelp releases a large dataset of businesses in a JSON format.
Each entry in the dataset JSON object, and looks something like this:
[
{
"business_id": "Pns2l4eNsfO8kk83dixA6A",
"name": "Abby Rappoport, LAC, CMQ",
"address": "1616 Chapala St, Ste 2",
"city": "Santa Barbara",
"state": "CA",
"postal_code": "93101",
"latitude": 34.4266787,
"longitude": -119.7111968,
"stars": 5.0,
"review_count": 7,
"is_open": 0,
"categories": [
"Doctors",
"Traditional Chinese Medicine",
"Naturopathic/Holistic",
"Acupuncture",
"Health & Medical",
"Nutritionists"
]
},
{
"business_id": "mpf3x-BjTdTEA3yCZrAYPw",
"name": "The UPS Store",
"address": "87 Grasso Plaza Shopping Center",
"city": "Affton",
"state": "MO",
"postal_code": "63123",
"latitude": 38.551126,
"longitude": -90.335695,
"stars": 3.0,
"review_count": 15,
"is_open": 1,
"attributes": { "BusinessAcceptsCreditCards": true },
"categories": ["Shipping Centers", "Local Services", "Notaries", "Mailbox Centers", "Printing Services"],
"hours": {
"Monday": "0:0-0:0",
"Tuesday": "8:0-18:30",
"Wednesday": "8:0-18:30",
"Thursday": "8:0-18:30",
"Friday": "8:0-18:30",
"Saturday": "8:0-14:0"
}
}
// ...thousands of other entries
]
Notice how unorganized and "dirty" the data is. As examples,
- the
attributesfield is absent in the first object, but present with data in the second - the
hoursfield is absent in the first object, but present with data in the second
With thousands of other data entries, it is not hard to imagine there are dozens of other abnormalities in the data.
Provided to you is a version of the Yelp dataset, you may load it by using the loadYelpData(part?: number) function in the ./include/data.ts file. There are other functions implemented in the same file that could be useful, such as creating randomized Businesses with createRandomData(n: number), or creating a dataset with a name, but reusing results from previous runs (loadOrCreate(datasetName: string, createDataset: () => Business[])).
Examine the results of these two functions - look at the fields on each entry and the types they typically hold. Additionally, inside of the ./include/data/ folder, there is a series of JSON files that you can use to inspect the dataset.
Programming Tasks
pairwise
Write a function pairwise that takes in an iterable of generic type T. This function returns an iterable iterator over consecutive pairs of elements from the original sequence. For example, given an iterable that yields (1, 2, 3, 4), the function returns an iterable iterator that yields ([1, 2], [2, 3], [3, 4]). The order of the elements in both iterators must match. This function must use O(1) storage to receive credit. Your implementation may use generators or a custom iterator class.
cycle
Implement the function cycle that takes in an array of iterables. This function returns an iterable iterator that yields one item at a time from each iterable in the order they are given. If any of the iterables reach their end before the others, that iterable is skipped. The returned iterable iterator is finished when all values from the iterables have been exhausted. For example, given iterables [(1, 2), (3)], the returned iterable iterator would yield (1, 3, 2). You may use O(m) space to store iterators, where m is the number of iterables, but your implementation must otherwise use O(1) space. Your implementation must use generators to receive credit.
BusinessQuery class
Create a class BusinessQuery that implements the Iterable interface. The BusinessQuery class has a constructor that takes in one parameter of type Iterable<Business> that is initialized through a parameterized constructor.
The BusinessQuery class must contain the following public methods. Each method must use constant (i.e., O(1)) storage to receive credit.
- A constructor that takes in one parameter of type
Iterable<Business> - A method
filterthat takes in a functionfunc: FilterFuncand returns aBusinessQueryobject that, when iterated over, yields only those businesses for which func(business) returns true. FilterFunc is the type of a function that takes in aBusinessand returns a boolean. - A method
excludethat takes in a functionfunc: FilterFuncand returns aBusinessQueryobject that, when iterated over, yields only those businesses for which func(business) returns false. - A method
slicethat takes in two integersstartandendand returns aBusinessQueryobject that, when iterated over, yields onlyBusinessQueryobjects in the 0-indexed range [start, end). That is, if the currentBusinessQuerywould yield 5Businessobjects, the business query returned by calling.slice(1, 3)would yield only 2Businessobjects (the ones at index 1 and index 2).slice’s parameters have the following requirements:- The start parameter is required
- The end parameter is optional
- Hint: The syntax for declaring a parameter as optional is a question mark (?) after the name. For example, in the function signature
function spam(egg?: string), the parameter egg can be a string if an argument was provided by the caller (e.g.,spam(spam))orundefinedif an argument was not provided (e.g.,spam()).
- Hint: The syntax for declaring a parameter as optional is a question mark (?) after the name. For example, in the function signature
- If start or end (when end is provided) are not integers,
slicethrows an exception of typeSliceErrorwith the messageslice: start and end must be integers.- The
SliceErrorclass is provided for you in the starter code
- The
- If start or end are negative,
slicethrows an exception of typeSliceErrorwith the messageslice: start and end must be non-negative. - If start is greater than end,
slicethrows an exception of typeSliceErrorwith the messageslice: start must not be greater than end - If end is greater than the number of elements yielded by the current (pre-slice) business query, the behavior is identical to the behavior when end is equal to the number of elements yielded by the current query.
- If start is greater than or equal to the number of elements yielded by the current query, the query produced by
slicewill yield no elements when iterated over.
- Any additional methods required to make a
BusinessQueryobject iterable. That is, any methods required by the Iterable interface- Hint:
BusinessQueryis iterable, but it is not itself an iterator. Do not add any unnecessary public methods.
- Hint:
- You may add any private attributes or methods needed for your implementation. Make sure they are declared as private.