資料結構與演算法：打好基礎，面向未來

研究所考試

資料結構與演算法：打好基礎，面向未來

發佈時間：20241230

對於資工、資管所考生而言，此科目為必考的重要學科。而在學習程式設計與求職的過程中，資料結構與演算法更是不可或缺的核心能力。

我們考研究所是為了拿到好的學歷讓就業更順利，而無論未來想成為軟體工程師、資料科學家，還是探索人工智慧的奧秘，資料結構與演算法都是「語言獨立」且「實踐導向」的技能。不僅幫助你撰寫高效程式，還能在面試中脫穎而出。

科技業找工作上，有個專有名詞叫做「刷題」，也就是透過反覆練習知名的資料結構演算法題庫網站(如Leetcode、HackerRank)找到很好的工作。目前頂尖的外商如Google、Nvidia、Meta(原本名稱為Facebook)、Amazon或者國內知名台商聯發科、台積電等等，都會被問到資料結構演算法的題目。需要在面試期間透過面試官實際提出的情境，透過Google doc或白板寫Pseudo code或實際撰寫資料結構演算法的程式，能夠有執行效率以及清晰邏輯才能通過面試。無論是剛畢業找Junior level的工作，還是剛畢業有工作經驗3-5年，想要進到這些知名公司。

什麼是資料結構？

資料結構是組織和管理資料的一種方式，它決定了如何有效存取、操作以及儲存資料。常見的資料結構包括陣列 (Array)、鏈結串列 (Linked List)、堆疊 (Stack)、佇列 (Queue)、樹 (Tree)、圖 (Graph) 等。不同的資料結構適合不同的應用場景，比如：

陣列：適合隨機存取資料。
堆疊與佇列：適合遵循特定順序的操作，如回溯或排程。
樹與圖：適合階層式資料或關聯式資料的操作。

然而，資料結構離不開演算法。演算法是實現這些結構操作的方法，比如排序、搜尋或最佳化。

Stack 與 Heap：記憶體管理的基礎

在理解資料結構之前，掌握電腦的記憶體管理能讓你更有信心處理程式設計中的問題。兩個核心概念是 Stack (堆疊) 和 Heap (堆積) 。

Stack (堆疊)
- 用途：用於函數呼叫的管理、區域變數的儲存。
- 特性：遵循後進先出 (LIFO) 原則。當函數呼叫時，會將其變數推入堆疊；函數結束後，變數會自動彈出。
- 速度：因為直接由 CPU 支援，堆疊的操作非常快。
- 限制：空間有限，通常大小在編譯時已確定。如果超過堆疊大小，會導致堆疊溢位 (Stack Overflow)。
Heap (堆積)
- 用途：用於動態記憶體分配，比如程式執行時的物件創建或動態陣列。
- 特性：無固定順序，適合靈活分配記憶體。開啟記憶體時需使用 new 或 malloc，釋放時使用 delete 或 free。
- 速度：操作比堆疊慢，因為需要尋址和管理。
- 限制：需要手動管理，若未正確釋放記憶體，可能導致記憶體洩漏。

資工所考古題探討

在這邊我們實際探討台大資工所資料結構與演算法考題。本題是113年台灣大大學此學科的第三題考題，算是近期很新的題目，題目如下：

(5 points) A Dynamic Array (abbreviated as DArray) is a type of array that permits the insertion and deletion of elements at the end to dynamically adjust its size. DArray D maintains three crucial attributes, including the underlying array D.data, the logical data size D.size, and the actual array capacity D.capacity. The invariant D.size ≤ D.capacity holds at any given moment. The Insert(D, x) operation appends the element x to the end of D.data. In case D.size = D.capacity, it triggers a call to Resize(D) before appending x, where Resize(D) allocates a new and larger array, moves all the data to it, increases the value of D.capacity, sets up D.data, and finally frees the old array. Two implementations of Resize(D) are provided: ResizeA(D) increases D.capacity by a constant (e.g., 10), while ResizeB(D) doubles D.capacity each time. To initialize DArray D, we set D.size to 0, D.capacity to 1, and allocate a size 1 array to D.data. Assuming that both allocating and freeing arrays of length n take O(n) time, and moving a single element takes O(1) time, please select the correct description(s) below.
- A. DArray is not suitable for random access because its memory may be relocated.
- B. DArray can only be allocated in the heap memory due to its dynamic memory nature.
- C. If ResizeA(D) is considered, the amortized time complexity of Insert(D, x) is O(1).
- D. If ResizeB(D) is considered, the amortized time complexity of Insert(D, x) is O(1).
- E. In the Delete(D) operation, if D.size > 0, D.size is decreased by one. If, at this stage, D.size < ⌈D.capacity/2⌉, we create a new array d’ with a size of ⌈D.capacity/2⌉, transfer the data to d’, set up d’ as the new data array for D.data, and subsequently free the old array. Then, the amortized time complexity of mixed Insert and Delete operations can be O(1), where Resize(D) can be either ResizeA(D) or ResizeB(D).

動態陣列與本題檢討

本題的動態陣列 (Dynamic Array, DArray) 是基於 Heap 管理的結構。其核心特點是允許在程式執行時動態調整大小，這是 Stack 無法實現的。檢討本題關鍵點如下：

本題的重點解析

初始化
- 動態陣列一開始分配 D.capacity=1，隨著元素插入而動態調整大小。
兩種擴容策略
- ResizeA：每次增加固定容量，導致擴容過於頻繁，攤還時間複雜度為 O(N)。
- ResizeB：每次容量加倍，雖然每次擴容的時間成本高，但攤還時間複雜度為 O(1)。
錯誤選項解析
- 選項 A：動態陣列依然支持隨機存取，並且其時間複雜度為 O(1)，所以 A 是錯誤的。
- 選項 C：固定增量 ResizeA導致攤還時間複雜度為 O(N)，所以 C 是錯誤的。
- 選項 D：加倍增量 ResizeB 保證了攤還時間複雜度為 O(1)，所以 D 是正確的。
- 選項 E：混合插入和刪除的攤還時間複雜度無法保證 O(1)（特別是使用 ResizeA），所以 E 是錯誤的。