I've added two new methods to my Amazon S3 CFC: listBuckets and listObjects. Both of them do about what you'd expect, returning a CFDIRECTORY-esque recordset object containing the rows you are interested in. I've attempted to make S3 appear like a "normal" filesystem where "/" is S3 itself, the top-level directories are your buckets, and your objects are below that. At the moment no consideration is made for paging or truncation. Leveraging the new functionality, here's complete source for a simple S3 browser (minus your key/secret):
<cfparam name="url.path" default="" />
<cfset s3 = createObject("component", "amazons3").init(
  "YOUR_AWS_KEY",
  "YOUR_AWS_SECRET"
) />
<cfoutput>
<cfset bp = "" />
<h1>
  <a href="?path=#bp#">ROOT</a>
  <cfloop list="#url.path#" index="segment" delimiters="/">
    <cfset bp = listAppend(bp, segment, "/") />
    / <a href="?path=#bp#">#segment#</a>
  </cfloop>
</h1>
<cfif url.path EQ "">
  <cfset b = s3.listBuckets() />
  <ul>
  <cfloop query="b">
    <li><a href="?path=/#name#">#name#/</a> #dateLastModified#</li>
  </cfloop>
  </ul>
<cfelse>
  <cfset q = s3.listObjects(listFirst(url.path, '/'), listRest(url.path, '/')) />
  <ul>
  <li><a href="?path=#reverse(listRest(reverse(url.path), '/'))#">..</a></li>
  <cfloop query="q">
    <li>
    <cfif type EQ "dir">
      <a href="?path=#listAppend(directory, name, '/')#">#name#/</a>
    <cfelse>
      <a href="#s3.s3Url(bucket, objectKey)#">#name#</a>
    </cfif>
    </li>
  </cfloop>
  </ul>
</cfif>
</cfoutput>
The default mode of operation assumes a delimiter of '/' (just like a filesystem). If you want to do non-delimited operations (like generic prefix matching), you'll want to supply an empty delimiter, or you'll get weird results. For example:
<cfset k_objects = s3.listObjects('my-bucket', 'k', '') />
If you omit the third parameter, the default '/' will be used, and you'll get back objects within the 'k' pseudo-directory, rather than objects that begin with 'k'. This is the reverse of the default behavior of the raw S3 API, which assumes you want simple prefixing and makes you explicitly add the delimiter if you want pseudo-directory contents.
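For comparison, here's the same call with the delimiter left at its default (a sketch using the same hypothetical bucket):

<!--- default '/' delimiter: returns the contents of the 'k' pseudo-directory --->
<cfset k_dir = s3.listObjects('my-bucket', 'k') />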
This dichotomy can also lead to weird results in the resulting recordset. Every recordset comes with both 'bucket' and 'objectKey' columns that match the raw S3 nomenclature, and 'directory' and 'name' columns that match the filesystem "view" of S3. If you're doing raw prefixes you'll want to use bucket/objectKey (as the directory/name semantics don't work with prefixes). If you're doing filesystem-type stuff you'll probably want directory/name (though bucket/objectKey will still be correct).
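To make that concrete, here's a quick sketch (hypothetical bucket and keys, not output from the CFC) showing both views of the same rows:

<cfset q = s3.listObjects('my-bucket', 'photos/2009') />
<cfoutput query="q">
  <!--- e.g. directory = "photos/2009" and name = "beach.jpg",
        while objectKey = "photos/2009/beach.jpg" --->
  #directory#/#name# (raw: #bucket# / #objectKey#)<br />
</cfoutput>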
Feature requests… things I need for real-world use.
More than bucket listings, when I upload content to S3/CloudFront I need to set custom upload headers (Expires, no-cache) and then change the ACL so the content is public. Also, if I upload content to replace existing content, I often want to rename the parent; this is a form of versioning to force browsers to reload non-expiring cached content.
Your CFC would be fantastic if it could set headers and ACLs, and rename when needed.
Also, I would want to upload an entire directory, or pass a zip with an "uncompress" option, so the contents would be uploaded to S3 in a hierarchical fashion.
That would be fantastic.
Hi,
I am putting an object on S3, but the object permission is not being set to "public-read" as defined in the function. I am using the following call:
s3.putObject(variables.s3location, res.serverFile, res.contentType, '300')
I do not see anywhere in the above call to define permissions other than the one set within s3.cfc.
Just wondering why the object would not get the "public-read" permission.
Sanjeev,
I think you're using http://amazons3.riaforge.org/, not my S3 CFC (at least judging by the method name and parameter order). My CFC has a boolean that you pass in for controlling whether something is public or not (i.e. has the public-read ACL), but I don't know anything about Joe's RIAForge project's capabilities.
Hey Barney,
Would you happen to have a sample piece of code implementing your S3 CFC to build a URL for "authenticated links to secured assets for embedding in pages"? Thanks!
Steve,
In the little browser above, you'll see this line (7-8 from the bottom), which does what you want:
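<a href="#s3.s3Url(bucket, objectKey)#">#name#</a>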
Just create the CFC instance (with your id/secret), and then pass the bucket and objectKey to the 's3Url' method. You can also supply requestType and timeout as the third and fourth parameters. They default to 'vhost' and '900' respectively. If you don't have a CNAME set up in DNS, you'll need to at least pass 'regular' as the third parameter.
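Putting it together outside the browser, a minimal sketch (the bucket and key are placeholders):

<cfset s3 = createObject("component", "amazons3").init("YOUR_AWS_KEY", "YOUR_AWS_SECRET") />
<!--- 'regular' request type (no CNAME set up), link valid for 900 seconds --->
<cfset signedUrl = s3.s3Url("my-bucket", "private/report.pdf", "regular", 900) />
<cfoutput><a href="#signedUrl#">Download the report</a></cfoutput>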
Thank you for the CFC. I am having some issues with the CFLOOP tag, resulting in an attribute error saying that ARRAY is not valid, but I cannot see that you are using that in the tag. Have you seen this error, or do you have any suggestions?
Regards,
-David
David,
You must be on CF7, right? The 'array' attribute was added in CF8. It's used within listBuckets and twice within listObjects. If you don't need those methods, you can just remove them. If you do need them, porting is simple. This CF8-style loop (with a placeholder array):
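<cfloop array="#items#" index="item">
  <!--- do something with item --->
</cfloop>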
is equivalent to this CF7-compatible version:
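<cfloop from="1" to="#arrayLen(items)#" index="i">
  <cfset item = items[i] />
  <!--- do something with item --->
</cfloop>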
If you make the mods, shoot me a patch file and I'll incorporate it into the next release.
Awesome, really! You responded immediately and are very supportive with your suggestions. Thanks! I will send you a revised file.
How do you delete a 'folder'? Calling deleteS3File(bucket, psuedoDir) does not seem to do anything if there are objects within that folder.
Jules,
You can't; S3 doesn't have any concept of containment. Every object is a top-level object which may or may not share a common prefix (containing slashes) with other objects, so you delete a "folder" by individually deleting each object that has the "folder" as a prefix in its object key.
Most S3 managers will use slashes as a delimiter to emulate browsing folders within your S3 bucket, but it's a UI construct, not something that S3 is actually aware of.
Hope that helps.
Yes, I realize that. That's why I put quotes around "folder" and called it 'psuedoDir'. What is the workaround, then?
My first thought is to edit your deleteS3File function. First, get the list of all objects in the bucket. Loop through them looking for a match of listGetAt(name, 1, '/') EQ psuedoDir. With each match, call deleteS3FileInternal(bucket, name, 0).
Seems resource-intensive, though. I was just hoping there was a smarter way to do it that I'd overlooked.
Jules, you are correct. You must delete each individual object with a separate DELETE request. So yes, the answer to your question is exactly what you propose: do a LIST request with a prefix of your "folder" name plus a trailing slash, and then delete every object that comes back (and make sure you consider paging if you have more than 1000 objects with the prefix).
That stinks. S3 should have a wildcard delete instead; DELETE #psuedoDir#/* would be nice. Anyway… here's my contribution, in case anyone needs to do the same.
<cffunction name="deleteS3File" access="public" output="false" returntype="void">
  <cfargument name="bucket" type="string" required="true" />
  <cfargument name="objectKey" type="string" required="true" />
  <!--- empty prefix and empty delimiter: list every object in the bucket, raw --->
  <cfset var q = application.as3.listObjects(arguments.bucket, '', '') />
  <cfloop query="q">
    <!--- match objects whose first path segment is the pseudo-directory --->
    <cfif listGetAt(q.objectKey, 1, "/") eq arguments.objectKey>
      <cfset deleteS3FileInternal(arguments.bucket, q.objectKey, 0) />
    </cfif>
  </cfloop>
</cffunction>
Hi Barney,
Thanks for the CFC, it is really excellent; I am still playing with it. One thing I noticed is that listObjects only returns 1000 items. Is that normal? I am using CFMX 7.
Many thanks,
Gaurav
Gaurav,
Yes, the 1000-row limit is expected. The S3 service returns a maximum of 1000 records per LIST request. If you want another "page" of records, you have to re-request the listing and add the 'marker' parameter, which is an objectKey to start after.
I intentionally didn't make the listObjects method automatically go through and build a complete list of every object within a bucket; it just returns the first page. However, I also haven't implemented a way to supply 'marker' to the method if you need a subsequent "page" of records. You can, however, use the 'prefix' argument to request a specific objectKey if you want the metadata about an object.
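For example, to get the metadata for a single known object, something like this should work (the bucket and key are placeholders):

<!--- an exact-key prefix with an empty delimiter returns just that object's row --->
<cfset info = s3.listObjects('my-bucket', 'photos/2009/beach.jpg', '') />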
Hi Barney,
Thanks for that. So how do you suggest scrolling via the listObjects method? If listObjects returns up through the 1000th item, can I then ask for the next 1000 items, or do I have to explicitly refer to the 1000th object and then start from there? Sorry for the details.
Is there any way the 1000 records could be made into an argument, so when using the method you can choose how many records to bring back?
Gaurav,
1000 is a hard limit imposed by Amazon, not something I control. Nor is it possible to specify a numeric offset; you have to specifically refer to the 1000th object when requesting objects 1001-2000. Adding a 'marker' parameter to the listObjects method would be straightforward: if present, just pass it through to S3 (ensuring it gets included in the signature string, of course).
That said, it'd probably be more convenient to have the method expose "normal" offset/limit parameters and be in charge of figuring out how to query for them with 'marker', so that logic doesn't have to live in the application. However, this could potentially be expensive, as every invocation would have to start at the top and scroll down until it found the desired records. I.e., listObjects(…, limit: 50, offset: 1400) would require querying for the first 1000 records, throwing them away, and then querying for 1001-2000. In most cases it probably doesn't matter, but it's something to consider if the CFC were to adopt those semantics.
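To illustrate, assuming a hypothetical fourth 'marker' argument were added to listObjects, an application could walk an entire bucket with a sketch like this:

<!--- Sketch only: the fourth 'marker' argument does not exist in the current CFC --->
<cfset marker = "" />
<cfset done = false />
<cfloop condition="NOT done">
  <cfset page = s3.listObjects('my-bucket', '', '', marker) />
  <!--- process this page of up to 1000 rows here --->
  <cfif page.recordCount LT 1000>
    <!--- fewer than 1000 rows means this was the last page (a heuristic; the raw
          S3 response's IsTruncated flag would be the more reliable signal) --->
    <cfset done = true />
  <cfelse>
    <!--- start the next page after the last objectKey we saw --->
    <cfset marker = page.objectKey[page.recordCount] />
  </cfif>
</cfloop>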