Query #1 find the files submitted for a userid/courseid
DECLARE @groupid VARCHAR(40),@courseid VARCHAR(40),@filesizelimit INT
SET @groupid = '%'
SET @courseid = '%'
SET @filesizelimit = '1'
SELECT TOP 100 cm.course_id,g.group_name,f.[file_name],f.file_size,f.dtmodified,ga.score,xy.FULL_PATH as [full path], 'http://BBSERVER/bbcswebdav/xid-'+cast(xy.[FILE_ID] as varchar(10))+'_1' as [file link]
FROM files f WITH (nolock)
INNER JOIN group_attempt_files gaf WITH (nolock) ON f.pk1 = gaf.files_pk1
INNER JOIN group_attempt ga WITH (nolock) ON gaf.group_attempt_pk1 = ga.pk1
INNER JOIN groups g WITH (nolock) ON ga.groups_pk1 = g.pk1
INNER JOIN course_main cm WITH (nolock) ON g.crsmain_pk1 = cm.pk1
INNER JOIN cms_doc.dbo.XYF_URLS xy WITH (NOLOCK) ON xy.FILE_NAME COLLATE SQL_Latin1_General_CP1_CS_AS = f.[file_name]
WHERE g.group_name LIKE @groupid AND cm.course_id LIKE @courseid
Query #2 - find the file location in the CS storage
If you DIDN'T have CS then there actually would be a physical /internal/courses folder on your file system. Files uploaded to the Content System, however, are stored in dated subfolders of the respective content area's root folder, such as: /storage/2015/2015-04/2015-04-16/44201/114902.6 – only the database permits tying the file's virtual path (as seen in the GUI) or its unique ID (xid) together with this actual physical path on disk.
What is the filestorage name 114902.6? The first part 114902 is the [XYF_BLOBS].[BLOB_ID] and the .6 refers to the cms_doc database as referenced in [cms].[dbo].[XY_FILE_SYSTEMS].
So 114902.6 would be referenced in [cms_doc].[dbo].[XYF_BLOBS].[BLOB_ID] = 114902
DECLARE @courseid VARCHAR(40)
SET @courseid = '%'
SELECT xu.FULL_PATH, xu.FILE_NAME, xf.MIME_TYPE, xf.FILE_SIZE, xb.STORAGE_FILENAME, xb.BLOB_SIZE
FROM XYF_URLS xu, XYF_FILES xf, XYF_FILE_VERSIONS xv, XYF_BLOBS xb
WHERE xu.FULL_PATH like '/internal/courses/'+@courseid+'/attempt/%'
and xu.FILE_ID = xf.FILE_ID
AND xf.FILE_ID = xv.FILE_ID
AND xf.LATEST_VERSION = xv.VERSION
AND xv.BLOB_ID = xb.BLOB_ID;
Note: CS also does a process called de-duplication, where it figures out if it already has an IDENTICAL of a file, to only keep references to the original file to save space.
In this example, file1.pdf is a sample I've used many times before so the original is stored in /2010/2010-04/2010-04-23/3601/11126.6 even though I "just" uploaded in 2015!
Meanwhile, UnderstandingStorage.pdf is a unique, new file I just uploaded today (2015-04-16) so it is stored in /2015/2015-04/2015-04-16/44201/114902.6
When you clear an assignment attempt
Blackboard will a) remove the database entries (so the two queries will no longer return records for that particular attempt) AND b) remove the FILE_ID from storage (if it is unique). If you went back into storage, you would find an empty folder.
E.g., If we deleted the attempt that had the file UnderstandingStorage.pdf, then /2015/2015-04/2015-04-16/44201/ would still be there, but no 114902.6
Thus, any true recovery of a deleted attempt would be both at the database AND filesystem level.