C programming assignment


Homework 6 – SHORT VERSION – If you submit this you cannot make an A in the class even if your class score is 100.

Points: 100 points + 20 bonus. (100 points are used as the full homework score. 20 points are bonus. They will be added to the total homework points when computing the average. The bonus points do not come from specific components. Anything above 100 points will help your other homeworks.)

Topics: Dynamic programming: Edit Distance (compute the distance and print the table to verify correctness); sort dictionary and use binary search; use library function qsort; use function pointers; extract words from text; given a word find the most similar words in the dictionary; fix words, reconstruct text.

Watch the Canvas video that presents this homework (under the Canvas entry for this assignment).

Program requirements (120 points)

Unless otherwise specified, you should always assume that every function you are asked to implement has to work for all sizes and variations of the data.

This assignment has 2 main parts:

• Part 1: Allow the user to repeatedly enter pairs of words to compute the edit distance for. The code that reads the words and calls the edit distance is provided in spell_checker.c. It will work once you implement the edit distance function. The edit distance function will build and print the table for the edit distance and also the distance itself. It should be CASE SENSITIVE. EditDistance("dog","DoG") = 2
  (I consider it important to be able to print the data from your program in a formatted, easily readable way that allows you to easily check and verify that the program does what you want it to do. Printing the table for the edit distance along with the indices and corresponding letters from the strings allows you to check that the program generates the same table as we did in class or for any other test case you develop on your own on paper.)

• Part 2:
  a. Read a dictionary and test filenames. See sample run.
  b. Load the words from the dictionary in an array, dict.
  c. Sort the dictionary array, dict, in increasing alphabetical order.
  d. For each word in the test file:
     i. Print it to the screen. Put two vertical bars around it to be able to tell if you read any extra space or not with the word. E.g. print |Can|, not just Can.
     ii. Make a copy with all lowercase letters.
     iii. Search for this lowercase word in the sorted dictionary array using binary search. (All the words in the dictionary have all lowercase letters.) Keep the count of how many words were touched-on during binary search (or how many times the loop for binary search executed) and print it. If in verbose mode, print the dictionary words that were used during binary search.
        1. If the word is found, it means the spelling is correct. PRINT IT TO THE SCREEN.
        2. If the word is not found, give these options to the user as to what correction to be used for this word in the output file:
           -1 – user will type the correct spelling
            0 – leave the word as is (do not apply any correction)
     iv. For option (-1) read a word from the user (this would be the correction the user gives); for 0 keep the word as is.
     v. Print the currently correct word (this depends on what option the user chose).
  e. Allow verbose and non-verbose mode. In verbose mode you print:
     1. The dictionary BEFORE sorting
     2. The dictionary AFTER sorting it
     3. The dictionary words "touched" by binary search as you are searching for the current test word.

• You will write all your code in the file spell_short_B.c (provided). A client file, spell_checker.c, is also provided. It implements the high level behavior of the program and calls specific functions that you must implement in spell_short_B.c.

• Details:
  1. Implement the Edit Distance between 2 strings as shown in class. It must be the BOTTOM-UP DYNAMIC PROGRAMMING method (i.e. the one that has NO recursion). Simply write the loop(s) to populate the 2D table. – 20 points

       Dist(0, 0) = 0
       Dist(0, j) = j
       Dist(i, 0) = i
       Dist(i, j) = Dist(i-1, j-1)                                          if x_{i-1} = y_{j-1}
       Dist(i, j) = 1 + min { Dist(i-1, j), Dist(i, j-1), Dist(i-1, j-1) }   if x_{i-1} ≠ y_{j-1}

     It should be CASE SENSITIVE. EditDistance("dog","DoG") = 2
  2. Print the distance matrix as a formatted table. – 25 points
  3. Allow the user to repeatedly compute the edit distance between pairs of words given as input. It stops when the user enters -1 -1. (Implementation already provided)
  4. Spell check.
     1. If user selects verbose mode, print the dictionary words before and after sorting and also the words touched-on during binary search. It should match the sample output perfectly: index number and word.
     2. Load a dictionary file. Sort the data in the file in alphabetical order. (If verbose mode, print the dictionary before and after sorting.) You can assume all the words in the dictionary are in lowercase. I STRONGLY encourage you to use the qsort function from the C library. E.g. read http://www.cplusplus.com/reference/cstdlib/qsort/ . Note that the compar function that it uses (and you need to write) takes POINTERS to whatever type of data is in your array. That means that if your array already has pointers, that function takes pointers to pointers. It may take a bit of careful trial and error, but it is worth the price to learn to use the qsort function. The compar argument is a function pointer. If you are not familiar with it you can read here: https://www.geeksforgeeks.org/function-pointer-in-c/ . Based on how you store the dictionary words (array of pointers or 2D array of chars), it may be a bit tricky to set up the compar function, or to give the correct size of the elements for the qsort function. If you write a good function to be passed for the compar argument, qsort will work and you do not need to implement a sorting function. This is a great opportunity to learn how to use a library function, as opposed to writing everything ourselves. Function pointers are also so cool…
     3. You will NOT compute the edit distance from every test word to every dictionary word, but you must calculate the time complexity. Calculate the worst case time complexity to compute the edit distance between the misspelled words and every dictionary word. Assume there are T misspelled words in the text file, D words in the dictionary and that each word can be at most MAX_LEN chars. What is the time complexity to compute the edit distance from each test word to each dictionary word? Since the word length can vary, you should assume the worst case, that is, assume that every test word and every dictionary word is size MAX_LEN. What is the Θ for this worst case scenario? Give your answer as a function of T, D and MAX_LEN. For example if T = 10 words, D = 222 dictionary words and MAX_LEN = 100, you would assume that each of those (10+222) words has 100 characters. Write the time complexity at the top of your file as a comment. (You do not need to worry about the time to read the words from files. Assume they are already in memory for this calculation.)
     4. Calculate the time complexity to search for a word in the dictionary (to see if it is correctly spelled) using binary search. Assume the worst case: the word is not found, and all the words have MAX_LEN.

Given files (different from the normal version of Hw5):
1. Grading criteria short B
2. list1.txt – text file (with phrase that includes misspelled words).
3. list0.txt – text file with only 2 words for easy debugging.
4. spell_short_B.c – write your solution code here. This has the same big functions called from spell_checker.c, but I renamed it to avoid confusion between this and the solution for the normal problem.
5. run_dsmall_list0_1.txt – Sample run for part 2 (skips part 1 by giving -1 -1 right away); uses the small dictionary and text file and verbose mode. See how the dictionary file is sorted and the words 'probed' by binary search (e.g. when searching for "you" it compared it with dictionary words "said", "use", "when", "your"). There is no output file for this version.
6. run_dsmall_list0_0.txt – Same as above but runs the non-verbose mode.
7. run_dmed_list1_0.txt – Sample run for part 2 (skips part 1 by giving -1 -1 right away); uses the medium size dictionary, dmed.txt, the non-trivial text file, list1.txt, and the non-verbose mode (it does not print the dictionary words).
8. run_dmed_list1_1.txt – Same as above (dmed.txt and list1.txt) but in verbose mode.

Files common with normal hw 6:
9. spell_checker.c – do not modify
10. spell.h – do not modify
11. redir1.txt – file to be used for input redirection for part 1. It gives bad file names for dictionary and text so that part 2 will not run. Here is the sample run, run_redir1.txt, when run with valgrind --leak-check=full ./a.out < redir1.txt
12. redir_100_4.txt – file to be used for input redirection for part 1. It contains words of max length (100). Here is the sample run for it: run_redir_100_4.txt
13. Dictionary files of 3 sizes (small, medium, big): dsmall.txt, dmed.txt, dbig.txt. dmed.txt is a slight modification of the 1-1000.txt file from https://gist.github.com/deekayen/4148741 . dbig.txt is taken from https://github.com/first20hours/google-10000-english/blob/master/google-10000-english.txt

Every dictionary file starts with the number of words it contains:

N
word1
word2
…
wordN

See dsmall.txt. It has 10 words and has number 10 at the top. You can assume there will be at least 1 word in the file (so N will be at least 1). You can also assume N will be an integer. (You do not need to deal with the cases where it is not an integer.) You can also assume that all the words in the dictionary are lower case.

Requirements:
1. The provided files are given with the Unix EOL (end-of-line). Since your program will run on a Unix system (omega or the VM) it will have to run with this type of file.
2. All the data must be passed as arguments. There cannot be any global variables. 50% of task credit lost if an array (1D or 2D) or other data is a global variable instead of being passed as an argument.
3. The program must not have any memory errors when run with Valgrind.
4. The table showing the cost must show the numbers aligned exactly as shown in the sample run: 3 spaces for number display in each cell, vertical bars (|) between cells and the horizontal line (of dashes) between rows.
5. You can assume that any string is at most 100 characters long. That includes: the file names and all the words (from the dictionary and text files and given by the user).
6. You should not need any of these, but in case you do, you can assume there will be at most 1000 words in the paragraph (input file with text) and that the paragraph will have at most 100000 characters.
7. You can assume that there is a new line after the last word in both the test and the dictionary files.
8. Do not modify the signature of any of the functions given in the .h file.
9. In order to implement the required two methods, you are encouraged to write other functions to do smaller parts of the work. Since they are only used by functions in spell.c, you do NOT need to put their signatures in the .h file. You simply write them in the spell.c file at the top.
10. Do not hardcode file names or the number of words/lines in files. By hardcode I mean using a specific number in the code (e.g. 1003). It is **NOT** hardcoding to read the first number in the file (e.g. read 1003 for dmed.txt), store it in a variable (e.g. N) and use that variable as needed (e.g. to allocate space, or control the number of loop iterations). You can hardcode the maximum sizes, if needed. You CAN hardcode the size of the string that stores a word read from the file based on the maximum word length given in the specifications (but remember the +1 needed for the string ending symbol '\0').

Suggestions:
1. DO NOT leave it for the last few days. There are independent components that you can work on. Make sure that you can at least read the data as you cannot move on if you are stuck on that.
2. You can work on the table display before you implement the edit distance (you can print 0 in every cell).
3. You can work on reading the dictionary and the test file before you implement the edit_distance function.
4. Run your code with both input redirection AND also user input (in separate runs, not in the same one).

Extra resources for programming components needed for this assignment:
1. Plan how and what data you will store, what data you will compute (and store it or not), and then write the code to match your plan. E.g. will you store the dictionary words (if so in what type of array)? Will you store the test words (if so in what type of array)? Once you compute the distance of one test word to every dictionary word, will you store that?
2. I recommend you store the distances. You could recompute them, but that would almost double the runtime of your program since computing the edit distances is the most time consuming component of the program (excluding reading from files).
3. After you find the smallest distance between the test word and any dictionary word, go back and identify the words that have that distance from the test word and print them.
4. If you need to read up on passing 2D arrays as arguments in C you can check this page: https://www.geeksforgeeks.org/pass-2d-array-parameter-c/ .
5. If you want to store the dictionary as an array of pointers, see the graph allocation for the 2D matrix for edges. It is not identical, but it has some similarities.
6. If you want to write other functions to help you accomplish the work, you can write those in spell_short_B.c. They do not need to be part of the header file since they will only be used in functions from the spell_short_B.c file and not from the client file.
   You can just declare them locally at the top or simply define them before they are used.
7. To see an example for reading from a file, see the examples1.c code posted on the Scans page under our second lecture of the semester. It reads integers, but the process is similar for strings.
8. When reading from the file, the first entry is the number of words, and thus you know how many strings you will be reading and how much space to allocate.
9. If you allocate the exact space to hold a string, remember to allocate one extra char in addition to the max length to hold the '\0'.
10. To print an integer on 5 reserved spaces you should use printf and specify the minimum width for printing. E.g. see how the numbers printed by the following printf statements align because 5 spaces are reserved (and so even the 'shorter' numbers use 5 spaces and they align well with the 'longer' numbers). (Note that this is just an example, not the homework solution. Make sure you use the number of spaces required in the homework for your solution.)
      printf("-%5d-",16); printf("-%5d-",3976); printf("-%5d-",2);
      printf("\n");
      printf("-%5d-",8257); printf("-%5d-",8); printf("-%5d-",52);
11. In order to print the horizontal line of dashes, you should count how many cells are in a row and what the width of one cell is. Remember to count 1 for the "wall" (i.e. "|") of the cell. Should you count one wall or two walls for each cell? Test your calculations for a small table like the one for "cs" and "cat" in the sample output.
12. If you are still struggling with printing a formatted table, you can work on the following helper problem (with a classmate or on your own). Working on this will NOT be considered collusion. Problem: read a word from the user and print it repeatedly in a table of 3 rows and 4 columns. E.g.

Enter a word: cat
+  cat+  cat+  cat+  cat+
~~~~~~~~~~~~~~~~~~~~~~~~~
+  cat+  cat+  cat+  cat+
~~~~~~~~~~~~~~~~~~~~~~~~~
+  cat+  cat+  cat+  cat+
~~~~~~~~~~~~~~~~~~~~~~~~~

Another sample run:

Enter a word: cs
+   cs+   cs+   cs+   cs+
~~~~~~~~~~~~~~~~~~~~~~~~~
+   cs+   cs+   cs+   cs+
~~~~~~~~~~~~~~~~~~~~~~~~~
+   cs+   cs+   cs+   cs+
~~~~~~~~~~~~~~~~~~~~~~~~~

How to submit

Submit only the spell.c file. You can submit other file(s) with your own test cases if you develop any, but your code will be tested with the posted data files and other test files we prepare. Include the compilation instructions (especially if they differ from the one shown on this page), but there should not be any other .c or .h files. The assignment should be submitted via Canvas.

As stated on the course syllabus, programs must be in C, and must run on omega.uta.edu.

IMPORTANT: Pay close attention to all specifications on this page, including file names and submission format. Even in cases where your answers are correct, points will be taken off liberally for non-compliance with the instructions given on this page (such as wrong file names, wrong compression format for the submitted code, and so on). The reason is that non-compliance with the instructions makes the grading process significantly (and unnecessarily) more time consuming. Contact the instructor or TA if you have any questions.
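
To make the Dist recurrence from the Details section concrete, here is a minimal sketch of the bottom-up table fill. The names edit_distance_sketch and min3 are hypothetical (the real function signatures come from spell.h, which is not reproduced on this page), and the sketch only returns the distance; it does not print the required table.

```c
#include <assert.h>
#include <string.h>

#define MAX_LEN 100  /* maximum word length stated in the requirements */

/* Smallest of three ints (hypothetical helper). */
static int min3(int a, int b, int c) {
    int m = a < b ? a : b;
    return m < c ? m : c;
}

/* Case-sensitive edit distance between x and y, computed bottom-up
 * (loops only, no recursion). Hypothetical name, not the one in spell.h. */
int edit_distance_sketch(const char *x, const char *y) {
    size_t n = strlen(x), m = strlen(y);
    assert(n <= MAX_LEN && m <= MAX_LEN);

    /* dist[i][j] = distance between the first i chars of x
     * and the first j chars of y. Local array, not a global. */
    int dist[MAX_LEN + 1][MAX_LEN + 1];

    for (size_t i = 0; i <= n; i++) dist[i][0] = (int)i;  /* Dist(i,0) = i */
    for (size_t j = 0; j <= m; j++) dist[0][j] = (int)j;  /* Dist(0,j) = j */

    for (size_t i = 1; i <= n; i++) {
        for (size_t j = 1; j <= m; j++) {
            if (x[i - 1] == y[j - 1]) {
                dist[i][j] = dist[i - 1][j - 1];
            } else {
                dist[i][j] = 1 + min3(dist[i - 1][j],
                                      dist[i][j - 1],
                                      dist[i - 1][j - 1]);
            }
        }
    }
    return dist[n][m];
}
```

With this sketch, edit_distance_sketch("dog", "DoG") evaluates to 2, which agrees with the case-sensitive example given in the assignment.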
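
The remark that the compar function "takes pointers to pointers" is easier to see in code. The sketch below assumes the dictionary is stored as an array of char* (one of the two storage options the assignment mentions); compare_words and sort_dictionary are hypothetical names, not functions from spell.h.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort passes the comparator POINTERS to array elements. Each element
 * here is a char*, so a and b really point to char* values and must be
 * dereferenced once before calling strcmp. */
static int compare_words(const void *a, const void *b) {
    const char *wa = *(const char *const *)a;
    const char *wb = *(const char *const *)b;
    return strcmp(wa, wb);
}

/* Sort n dictionary words in increasing alphabetical order.
 * Assumes dict is an array of n pointers to '\0'-terminated strings. */
void sort_dictionary(char **dict, size_t n) {
    assert(dict != NULL);
    /* Element size is the size of one pointer, not of one word. */
    qsort(dict, n, sizeof dict[0], compare_words);
}
```

If the words were stored in a 2D array of chars instead, each element would be a whole row, so the element size passed to qsort would be the row size and the comparator could cast its arguments directly to const char * without the extra dereference.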
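
The binary search step in Part 2 (count the probes, and print the touched words in verbose mode) could look roughly like the sketch below. The function name, its parameters, and the verbose output format are all assumptions made for illustration; the required output format is whatever the posted sample runs show.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Look for word in the sorted array dict of n words.
 * Returns the index if found, -1 otherwise.
 * *probes receives the number of loop iterations (words touched).
 * If verbose is non-zero, each touched word is printed (format assumed). */
int search_dictionary(char **dict, int n, const char *word,
                      int verbose, int *probes) {
    assert(dict != NULL && word != NULL && probes != NULL);
    int lo = 0, hi = n - 1;
    *probes = 0;

    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        (*probes)++;
        if (verbose) {
            printf("dict[%d] = %s\n", mid, dict[mid]);
        }
        int cmp = strcmp(word, dict[mid]);
        if (cmp == 0) return mid;        /* spelling is correct */
        if (cmp < 0) hi = mid - 1;       /* word sorts before dict[mid] */
        else         lo = mid + 1;       /* word sorts after dict[mid] */
    }
    return -1;                           /* not in the dictionary */
}
```

The caller would pass the lowercase copy of the test word, since the dictionary words are all lowercase.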
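
For the helper problem (a word repeated in a table of 3 rows and 4 columns), one possible approach is sketched below. It assumes each cell is one "+" wall followed by the word right-aligned in 5 characters, which is consistent with the 25-character separator line in the sample runs. The function names are hypothetical, and this is the helper problem only, not the homework table.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Characters in one printed row: every cell contributes its width plus
 * ONE wall, and the row ends with one extra closing wall. */
size_t row_width(int cols, int width) {
    assert(cols > 0 && width > 0);
    return (size_t)cols * (size_t)(width + 1) + 1;
}

/* Write one row such as "+  cat+  cat+" into out (capacity cap). */
void format_row(char *out, size_t cap, const char *word, int cols, int width) {
    assert(strlen(word) <= (size_t)width);   /* word must fit in a cell */
    assert(cap > row_width(cols, width));    /* room for row and '\0' */
    size_t used = 0;
    for (int c = 0; c < cols; c++) {
        /* "%*s" takes the minimum field width from the argument list. */
        used += (size_t)snprintf(out + used, cap - used, "+%*s", width, word);
    }
    snprintf(out + used, cap - used, "+");
}

/* Print rows x cols cells, each row followed by a line of '~'. */
void print_table(const char *word, int rows, int cols, int width) {
    char line[256];
    size_t w = row_width(cols, width);
    assert(w < sizeof line);
    format_row(line, sizeof line, word, cols, width);
    for (int r = 0; r < rows; r++) {
        printf("%s\n", line);
        for (size_t k = 0; k < w; k++) putchar('~');
        putchar('\n');
    }
}
```

Calling print_table("cat", 3, 4, 5) prints three rows of `+  cat+  cat+  cat+  cat+`, each followed by 25 tildes. The same counting (cells times cell width plus wall, plus one closing wall) applies to the line of dashes in the edit distance table, with the cell width the homework requires.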
